INTRODUCTION


There has been a long-standing interest in the morphology of the
female pelvis amongst anthropologists and anatomists. In 1830
and 1844, Weber and von Stein recognized four main types of pelves. In 
1885, Turner described three types (1). A long narrow oval type was 
considered to resemble that found among anthropoid apes. The round 
type was considered to be the classical female type. A wedge-shaped 
pelvis was felt to simulate that of the male. A transverse oval type was 
also described. 

Obstetrical and radiological interest in the pelvic shape began in 1934- 
38 due to the emphasis placed by Caldwell and Moloy’s investigations 
on the role of morphology in the mechanism of labor (1). These investi- 
gators went one step further and described intermediate types which were 
formed by various combinations of these four parent types. In order 
to be able to define these intermediate forms they divided the pelvis 
into anterior and posterior segments at the widest transverse diameter 

*From the Department of Obstetrics, West Baltimore General Hospital, 
Baltimore, Maryland. 

† Present address, New York State Dept. of Health, Albany, New York.




144 LILIENFELD, TREPTOW, AND DIXON 


of the inlet. They then classified each segment into one of the four 
main types. The various pelvic types that are formed by this procedure 
are illustrated in Fig. 1.¹ This classification has been widely accepted
by obstetricians in America and Great Britain and has been utilized as a 
basis for prognostication of labor. 

Other investigators, in particular Nicholson (2) and Allen (3), have
felt that typing of the pelvis on the basis of appearance is subject to a
large degree of error and that such classification should be more firmly 
based on quantitative data. So far as we know, however, no one has 
attempted to investigate the reliability of pelvic classification into the 
Caldwell-Moloy morphological types, nor has any attempt been made 
to determine whether these types can be characterized by measurement 
rather than by sensory impressions and judgments. 


¹ Acknowledgment is made to Dr. C. T. Javert and to the North Carolina
Medical Journal for permission to use this figure.

We have felt that a method of study recently utilized in evaluating a
related diagnostic tool might be fruitfully applied to x-ray pelvimetry. 
Yerushalmy (4), in assessing the value of the various types of films used 
in tuberculosis case-finding, studied the variations of different readers 
in interpreting the same set of films (inter-individual variation) and the 
inconsistency of interpretation of a reader in reading the same film twice 
(intra-individual variation). As a result of these studies he concluded 
that none of the methods used was superior to any of the others in find- 
ing cases of tuberculosis. We felt that a similar analysis of the inter- 
pretation of the qualitative features of the pelvis, and in particular, of 
the Caldwell-Moloy classification, would be of value in assessing the 
diagnostic reliability of such features. It is plain that features which 
are inconsistently interpreted in any considerable proportion of cases 
must be of doubtful diagnostic value. (Naturally, diagnostic value is 
not proved by consistency of interpretation; but consistency seems to us 
an essential of a diagnostic procedure.)

We also felt that it would be possible to utilize the measurement data 
as a basis for the classification of the pelvis in order to determine whether 
it could be accomplished in as consistent or more consistent a manner. It 
is with these objects in mind that the authors present the results of such 
an investigation. 


SOURCE OF MATERIAL AND METHOD OF STUDY 


The x-rays employed in this study were obtained from the files of the 
West Baltimore General Hospital. The majority of these x-rays were 
taken ante-natally on patients who were subsequently delivered at the 
same institution. Some were taken during or after labor. Both private 
and clinic patients were represented. A number of the x-rays taken 
were on patients who were delivered at six other Baltimore hospitals. 
These likewise included both private and clinic patients. The patient
was referred ante-natally to the West Baltimore General Hospital by 
either the private physician or obstetrical resident. The series, as a 
whole, represents a highly select group since the examination was under- 
taken only when the referring physician felt that there was something 
clinically questionable about the pelvis. In the case of primiparous 
breeches the selection was at a minimum since many physicians feel 
that routine x-ray pelvimetry should be performed in such cases. 

The method of pelvimetry employed was the isometric technique described
by Steele and Javert (5). Two x-ray views of the pelvis were
obtained: lateral and antero-posterior. From these films two general
types of information were obtained: (1) quantitative, and (2) qualita- 
tive. The quantitative data consisted of certain diameters and areas 
calculated therefrom. The qualitative features included some informa- 
tion about the fetus. From these data the reader formed an impression 
as to whether or not a difficult labor on a pelvic basis was to be expected. 

On the antero-posterior view the following diameters were measured:
the greatest transverse of the inlet, the anterior transverse of the inlet,
the inter-spinous, the true transverse of the mid-pelvis (described by
Allen (3)) and the inter-tuberous. On the lateral film, the following
were measured: the obstetric conjugate, the anterior and posterior 
sagittal of the inlet, the antero-posterior of the mid-pelvis, the antero- 
posterior of the outlet, the posterior sagittal of the outlet and the pos- 
terior sagittal of the mid-pelvis. In the last, two different measure- 
ments were used: posterior sagittal (A), which represents the diameter 
from the ischial spines to the sacrum lying parallel to the inlet, and 
posterior sagittal (B), which was measured between the ischial spines 
and a posterior endpoint lying at the junction of the fourth and fifth 
sacral segments. These measurements were taken as described by Steele 
and Javert (5), except as otherwise indicated or described. 

While the study was in progress, the authors determined what was felt 
to be a more accurate method of measuring the posterior sagittal of the 
inlet. This will be described in a separate report. 

In addition to these diameters the areas of the different pelvic planes 
were calculated according to the formula given by Nicholson (2): 


Area = π × T/2 × AP/2


where T and AP represent the transverse and antero-posterior diameters
of that particular plane. Three areas were calculated: (1) the inlet, (2) 
the midpelvic, using the interspinous as the transverse diameter, and 
(3) the midpelvic with the true transverse representing the transverse 
diameter. Since areas could not be calculated for the outlet, sums of 
the posterior sagittal and inter-tuberous were utilized as being indica- 
tive of the capacity at this plane. 
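
As an illustration of the area calculation, the following minimal sketch assumes that each pelvic plane is approximated by an ellipse whose full axes are the transverse and antero-posterior diameters. The function name and the sample diameters are hypothetical and are not taken from the study.

```python
import math


def plane_area(transverse_cm: float, antero_posterior_cm: float) -> float:
    """Area (sq. cm.) of an ellipse whose full axes are the two diameters.

    Assumption: the plane is treated as an ellipse, so the area is
    pi * (T / 2) * (AP / 2).
    """
    return math.pi * (transverse_cm / 2.0) * (antero_posterior_cm / 2.0)


# Hypothetical inlet diameters, for illustration only.
inlet_area = plane_area(13.0, 11.0)
print(f"inlet area = {inlet_area:.1f} sq. cm.")
```

The same function would serve for either mid-pelvic area by passing the inter-spinous or the true transverse as the first argument.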

The qualitative features of the pelvis were recorded as follows: The 
anterior and posterior segments of the inlet were classified according to 
the Caldwell-Moloy morphologic classification. The sacral segments were 
differentiated and counted. The spines were evaluated as to their promi- 
nence. The lateral bore of the pelvis which represents the relationship 
of the first three sacral segments to the posterior border of the symphysis
was estimated. An impression of the inclination of the sacrum, the
curvature of the first three sacral segments, the prominence of the tip of 
the sacrum and the slope of the side walls was made. In addition to the 
specifically pelvic features, the presentation and position of the fetus 
and the obliquity and flexion of the fetal vertex (if presenting as such) 
were recorded.

Approximately 225 x-ray pelvimetries were read. Fourteen of these 
were eliminated from the study since they were considered as being 
technically unsatisfactory by at least one reading. In several instances 
among the remaining 211, individual diameters were not measured since 
the reader could not visualize satisfactorily the endpoints. This is the 
reason for the numerical discrepancies in several of the tables. 

The films were numbered at random and were read by reader A. 
After completion of this reading the series was renumbered and reread 
by A. The films were read twice in the same manner by reader B. 
However, due to limitations of time, B only read about one-half of the 
films read by A. The comparison of the first and second readings of A 
will indicate the ability of a reader to agree with himself when reading 
the same set of films twice (intra-individual variation). The comparison 
of the first reading of A with the first reading of B will afford a measure 
of the extent of variability between two individuals reading the same 
set of films (inter-individual variation). 


QUALITATIVE FEATURES 


Caldwell-Moloy classification 


As previously stated, the anterior and posterior segments of the inlet 
were classified into the four Caldwell-Moloy types: anthropoid, android,
gynecoid, and platypelloid. Tables 1 and 2 show the results of repeated 
classifications for 211 films read twice by reader A, and for 103 read 
twice by reader B. 

Considering first the anterior segment, we note that of 62 films classi- 
fied as anthropoid on first reading, 31 were so classified on second read- 
ing. Of 91 films classified as gynecoid on first reading, 51 were put in 
the same category on second reading. Out of the 211 films read, there 
were 110, or 52 per cent, which reader A placed in the same category 
at both readings. Considering reader B, we find that out of 103 films 
read, there were 69, or 67 per cent, which were placed in the same 
category on both readings.

For the posterior segment, the results are 70 per cent agreement for
reader A and 57 per cent agreement for reader B.

In Table 3 are shown similarly the first readings of A against the 
first readings of B. It will be observed that for the anterior segment 
there was agreement in 54 of 103 cases, or 52 per cent, and for the pos- 
terior segment, agreement in 60 of 103 cases, or 58 per cent. 

Taking the data as a whole, it appears that in about 60 per cent of the 
cases a classification is made which is repeated on a second reading, 
either by the same or a different reader. In 40 per cent a second reading 
was differently classified from the first. 
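
The agreement percentages quoted above are the diagonal of each comparison table divided by the number of films read. A minimal sketch of that arithmetic, using reader A's counts from Table 1 (the function and variable names are ours, introduced only for illustration):

```python
def percent_agreement(table):
    """Return (films in agreement, films read, per cent agreement).

    `table` is square: rows are first readings, columns second readings,
    so the diagonal holds the films classified alike both times.
    """
    agree = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return agree, total, 100.0 * agree / total


# Reader A, from Table 1. Rows and columns are ordered
# gynecoid, anthropoid, android, platypelloid.
reader_a_anterior = [
    [51, 9, 20, 11],
    [21, 31, 10, 0],
    [14, 12, 25, 0],
    [4, 0, 0, 3],
]
reader_a_posterior = [
    [49, 19, 13, 3],
    [8, 79, 1, 0],
    [6, 6, 6, 3],
    [1, 0, 4, 13],
]

for segment, table in (("anterior", reader_a_anterior),
                       ("posterior", reader_a_posterior)):
    agree, total, pct = percent_agreement(table)
    print(f"{segment} segment: {agree} of {total} films, {pct:.0f} per cent")
```

The same function applies unchanged to the counts for reader B and to the comparison between readers.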


TABLE 1


Comparison of two interpretations of shape of pelvis by reader A


FIRST READING                    SECOND READING OF A
OF A               Gynecoid  Anthropoid  Android  Platypelloid  Total

Anterior segment
  Gynecoid               51           9       20            11     91
  Anthropoid             21          31       10             0     62
  Android                14          12       25             0     51
  Platypelloid            4           0        0             3      7
  Total                  90          52       55            14    211

Posterior segment
  Gynecoid               49          19       13             3     84
  Anthropoid              8          79        1             0     88
  Android                 6           6        6             3     21
  Platypelloid            1           0        4            13     18
  Total                  64         104       24            19    211

TABLE 2


Comparison of two interpretations of shape of pelvis by reader B


FIRST READING                    SECOND READING OF B
OF B               Gynecoid  Anthropoid  Android  Platypelloid  Total

Anterior segment
  Gynecoid               22           8        3             4     37
  Anthropoid              3          37        3             0     43
  Android                 6           3        3             1     13
  Platypelloid            3           0        0             7     10
  Total                  34          48        9            12    103

Posterior segment
  Gynecoid               26           7        8             9     50
  Anthropoid              8          18        0             0     26
  Android                 4           0        7             4     15
  Platypelloid            1           0        3             8     12
  Total                  39          25       18            21    103


TABLE 3


Comparison of interpretations of pelvic type by two readers


FIRST READING                    FIRST READING OF B
OF A               Gynecoid  Anthropoid  Android  Platypelloid  Total

Anterior segment
  Gynecoid               22          12        4             4     42
  Anthropoid              8          22        3             0     33
  Android                 6           8        6             2     22
  Platypelloid            2           0        0             4      6
  Total                  38          42       13            10    103

Posterior segment
  Gynecoid               25           1        6             4     36
  Anthropoid             16          25        2             0     43
  Android                 7           0        4             3     14
  Platypelloid            1           0        3             6     10
  Total                  49          26       15            13    103


This is not to be taken as implying that for 60 per cent of all films 
the classification is free from doubt. The proportion of such films is 
not known but is considerably less than 60 per cent; for a doubtful film 
will in the nature of the case sometimes be placed in the same category 
on two successive readings so that the 60 per cent agreement includes 
some films which would on further readings show disagreement. 
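
The point can be made concrete with a deliberately simple model, offered only as an illustration and not as part of the study. Suppose every film is either clear-cut (always classified alike) or doubtful (classified alike on a repeat reading with probability p). The observed agreement is then the clear-cut share plus p times the doubtful share, and the clear-cut share can be solved for. The values of p below are hypothetical.

```python
def clear_cut_fraction(observed_agreement: float, p_repeat: float) -> float:
    """Share of films free from doubt under the two-group model.

    Solves observed = clear + (1 - clear) * p_repeat for `clear`.
    """
    if not 0.0 <= p_repeat < 1.0:
        raise ValueError("p_repeat must lie in [0, 1)")
    return (observed_agreement - p_repeat) / (1.0 - p_repeat)


# Hypothetical chances that a doubtful film is classified alike twice.
for p in (0.25, 0.5):
    share = clear_cut_fraction(0.60, p)
    print(f"p = {p:.2f}: clear-cut films = {100 * share:.0f} per cent")
```

Under either assumed value the clear-cut share falls well below the 60 per cent observed agreement, which is the sense of the caution above.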


Other qualitative features 

For the various other qualitative features of the pelvis the results are 
summarized in Table 4. No extended discussion of this table need be 
given. It is clear that none of the qualitative features are determined 
with anything approaching certainty; and that on the whole the pelvic 
classification is rather less consistently determined than are the other 
features. 

It seems reasonably clear that morphological types no more reliably 
distinguishable than these—or, if you will, readers no more reliable in 
their decisions as to typing—leave much to be desired. It has therefore 
seemed desirable to attempt to see whether pelvic measurements could 
be used to characterize the Caldwell-Moloy types. The remainder of 
this paper is devoted to this question. 


TABLE 4

[Table 4 is not legible in the source scan. It reported, for each
qualitative feature of the pelvis (including the anterior and posterior
segments and the prominence of the spines), the number and per cent of
films read consistently, under the headings of intra-individual and
inter-individual consistency for readers A and B.]


X-RAY PELVIMETRY 151 


MEASUREMENT DATA AS A BASIS FOR PELVIC CLASSIFICATION 


Pelvic classification on the basis of visual judgment of shape is, as 
we have seen, subject to large uncertainties. Other workers have
appreciated that this would probably be the case, and that measurement might
provide a more precise means of classification. Allen (3) in particular
has suggested various ratios of measurements as a possible basis for 
classification. He did not, however, make any attempt to see how far 
such measurements would agree with visual typing. 


Precision of measurements 

Before attempting to relate the measurement data to pelvic types, we
felt it desirable to investigate the precision of the measurements them- 
selves. For this purpose the entire set of films was measured twice by 
reader A, and about half the total was measured by reader B. Time 
permitted reader B to make remeasurements only on certain diameters. 
Table 5 presents a summary of the results. In this table are shown, first, 
the means and standard deviations of the various measurements, as de- 
termined from A’s first readings. (We point out that the use of the 
mean of A’s two readings would have given essentially the same result.) 
The next three columns give: (a) the standard deviation of the differ- 
ence between A’s first and second readings; (b) the standard deviation 
of the difference between A’s first and B’s first readings; and (c) the 
standard deviation of the difference between B’s first and second read- 
ings, for those readings for which an adequate number of second readings 
were available. 

In interpreting these standard deviations some points should be borne
in mind. First, the importance of an error of measurement depends in
a large part on the variability of the population being measured. If the
population has a standard deviation of 1 cm., an error of measurement
of 3 mm. is far less serious than if the population has a standard
deviation of 5 mm. Hence it is necessary to compare standard deviations
which represent measurement errors with the standard deviation of the
population, rather than with any absolute standard, in deciding whether
the measurements have any value. Second, it should be recognized that a
considerable element of judgment enters into the measurements, as re- 
gards the selection of endpoints on the film. It is to be expected that 
this judgment may well show greater variability from one observer to 
another than for the same observer from one time to another. 

Finally, it must be recognized that different observers may well have 
systematic differences in judgment as to endpoints. Such differences
may or may not be of importance, depending in part on their size and
in part on the use to which the measurements are to be put. 

The question of what should be taken as the standard error of measure- 
ment requires some consideration. If we are concerned only with the 
intercomparison of the measurements of a single observer, it is reason- 
able to estimate the standard error from repeated measurements by this 
observer. But as soon as more than one observer is in question, we must 
plainly take account of differences between observers, both systematic 
and random. We believe that this latter situation is in general the more 
important. We have therefore taken as our standard error of measure- 
ment the standard deviation of the difference between A’s and B’s first 
readings, divided by √2.
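
The division by the square root of 2 follows because the difference of two readings with independent errors of equal variance has twice the variance of a single reading. A minimal sketch of the computation, with hypothetical readings that are not data from the study:

```python
import math
import statistics


def standard_error_of_measurement(first_reader, second_reader):
    """Standard deviation of the paired differences, divided by sqrt(2).

    Assumes the two readers' errors are independent and of equal variance.
    """
    differences = [a - b for a, b in zip(first_reader, second_reader)]
    return statistics.stdev(differences) / math.sqrt(2)


# Hypothetical readings (cm.) of one diameter on the same five films.
reader_a = [11.2, 10.8, 12.1, 11.5, 10.9]
reader_b = [11.0, 11.1, 12.0, 11.9, 10.6]

sem = standard_error_of_measurement(reader_a, reader_b)
print(f"standard error of measurement = {sem:.2f} cm.")
```

Note that `statistics.stdev` uses the divisor n - 1.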

We have finally to consider the question of bias. We have recorded 
the mean difference between A’s and B’s first readings, and have indi- 
cated with an asterisk the values which are more than 3 times their 
standard error in absolute value. We again call attention to the fact 
that the bias is to be judged with reference to the variability of the 
population of measurements. It is clear that some measurements are 
subject to a bias which is uncomfortably large. We may point out in 
particular the anterior sagittal of the inlet and the posterior sagittal of 
the outlet. 
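
The asterisk rule described above can be sketched as follows: the mean difference between the two readers is compared with three times its own standard error. The function name and the sample readings are hypothetical.

```python
import math
import statistics


def bias_check(first_reader, second_reader, threshold=3.0):
    """Return (mean difference, its standard error, flagged).

    `flagged` is True when the mean difference exceeds `threshold`
    times its standard error in absolute value.
    """
    differences = [a - b for a, b in zip(first_reader, second_reader)]
    mean_diff = statistics.mean(differences)
    std_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_diff, std_error, abs(mean_diff) > threshold * std_error


# Hypothetical readings (cm.): reader A runs about half a centimeter high.
reader_a = [10.5, 10.6, 10.4, 10.5, 10.7, 10.5]
reader_b = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]

mean_diff, std_error, flagged = bias_check(reader_a, reader_b)
print(f"mean difference = {mean_diff:.2f} cm., "
      f"standard error = {std_error:.2f} cm., flagged = {flagged}")
```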

It was interesting to discover after the completion of our study that 
Nicholson in 1943 (8) had attempted to estimate the errors associated 
with the measurement of certain diameters which were obtained with his 
stereometric technique. After instructing his secretary in the techniques 
of measurement, he had her read 30 sets of films which he had also read. 
He analyzed these measurements and obtained an estimate of the errors 
present. In Table 6 the errors obtained by these two readers are con- 
trasted with the standard errors obtained in the present study. 


TABLE 6


Comparison of standard errors of measurement given by Nicholson with
those of present study


                        STANDARD ERROR OF MEASUREMENT (cm.)

DIAMETER                Nicholson    Nicholson’s    Present
                                     secretary      study

Conjugate                 0.04         0.16          0.24
Transverse                0.10         0.14          0.28
Inter-spinous             0.20         0.45          0.27
Antero-posterior
  of mid-pelvis           0.25         0.55          0.28


Note: Nicholson’s data were published as probable errors
in millimeters. They have been converted to standard
errors in centimeters for comparison with ours.

We note that in the case of the conjugate and transverse diameters
our standard errors are larger than those reported by Nicholson. In the
other two diameters they are reasonably similar.
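
The conversion mentioned in the note to Table 6 can be sketched as below. It assumes normally distributed errors, for which the probable error is 0.6745 times the standard error; the input value is hypothetical and is not one of Nicholson's published figures.

```python
# For normally distributed errors: probable error = 0.6745 x standard error.
PROBABLE_ERROR_FACTOR = 0.6745


def probable_error_mm_to_standard_error_cm(probable_error_mm: float) -> float:
    """Convert a probable error in millimeters to a standard error in cm."""
    standard_error_mm = probable_error_mm / PROBABLE_ERROR_FACTOR
    return standard_error_mm / 10.0


# Hypothetical probable error of 1.0 mm.
print(f"{probable_error_mm_to_standard_error_cm(1.0):.2f} cm.")
```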

The discrepancies found may be due to one or more of the following 
reasons: Nicholson instructed his secretary in the technique of selecting 
endpoints, etc., immediately prior to her reading the films. This prob- 
ably decreased the error. In addition, the radiographic techniques used 
were different. With his method Nicholson attempts to attain a higher 
degree of precision in his measurements than was attempted in this study. 
There is also a small amount of underestimation of his standard error 
because he used n in his computations rather than the more appropriate 
n − 1.
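
The size of the underestimation from using n rather than n − 1 can be gauged with a short calculation. The sketch assumes the divisor entered a sum of squares in the usual way, so that the correction is the square root of n/(n − 1); n = 30 is the number of film sets read by both Nicholson and his secretary.

```python
import math


def corrected_sd(sd_with_n: float, n: int) -> float:
    """Rescale a standard deviation computed with divisor n to divisor n - 1."""
    return sd_with_n * math.sqrt(n / (n - 1))


factor = corrected_sd(1.0, 30)
print(f"underestimation of about {100 * (factor - 1):.1f} per cent")
```

With 30 sets of films the correction amounts to less than 2 per cent, which is small beside the differences shown in Table 6.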

In view of these considerations, especially of the differences in tech- 
niques, we think it noteworthy that these estimates agree as well as 
they do. 


Relationship of measurement data to pelvic types 


It appears that the measurement data are interpreted with sufficient 
consistency to attempt to utilize ratios as a basis for classification. Since 
the classification is concerned with the shape of the inlet, every possible 
ratio of the four diameters of the inlet was studied. For the posterior 
segment, only one ratio was available, the posterior sagittal/transverse 
which we shall refer to as the posterior sagittal ratio.² For the anterior
segment, three ratios were calculated as follows: (1) anterior sagittal/ 
anterior transverse, (2) anterior sagittal/transverse, and (3) anterior 
transverse/transverse. 

The procedure for evaluating these ratios was as follows: Due to the 
consistency of the measurements it was felt that only a small error would 
be introduced by selecting one reading for computing these ratios. The
reading selected was the first reading of reader A. In all, 103 x-rays 
had four interpretations (two by reader A and two by reader B). It 
was logical to assume that any x-ray that had been classified as being of 
one type on three or more of these readings had a good probability of 
being characteristic of that type. The mean and standard deviations 
of the distribution of these four ratios were computed for the gynecoid 
and anthropoid types as determined by three or more readings. For 
the android and platypelloid types, the ranges were computed due to the 
small number of x-rays classified in these classes (Table 7). The fre- 


² This posterior sagittal ratio differs from that used by Allen. He defined it as
posterior sagittal/antero-posterior ratio.

TABLE 7
Relations between pelvic type and ratios of certain measurements

PELVIC TYPE      NUMBER    MEAN    STANDARD     MEAN        RANGE
                                   DEVIATION    ± 2 S.D.

Posterior sagittal ratio
Gynecoid           26      0.46      0.04       .38-.54
Anthropoid         30      0.55      0.04       .47-.63
Android             4                                       .36-.48
Platypelloid        8                                       .30-.44

Anterior sagittal/anterior transverse ratio
Gynecoid           27      0.51      0.03       .45-.57
Anthropoid         31      0.55      0.05       .45-.65
Android             6                                       .50-.64
Platypelloid        4                                       .34-.44

Anterior sagittal/transverse ratio
Gynecoid           27      0.44      0.03       .38-.50
Anthropoid         31      0.49      0.04       .41-.57
Android             6                                       .44-.56
Platypelloid        4                                       .34-.44

Anterior transverse/transverse ratio
Gynecoid           27      0.87      0.02       .83-.91
Anthropoid         31      0.88      0.03       .82-.94
Android             6                                       .80-.88
Platypelloid        4                                       .88-.92


quency distributions of the four ratios of these 103 x-rays are illustrated 
in Figs. 2 to 5. Superimposed on these distributions are the limits 
(mean + 2 standard deviations) of those classified as gynecoid and 
anthropoid and the ranges of those classified as android and platypelloid. 
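
The limits quoted for the gynecoid and anthropoid types are simply the mean plus and minus two standard deviations. A minimal sketch using the posterior sagittal ratio values from Table 7 (the function name is ours):

```python
def two_sd_limits(mean: float, sd: float) -> tuple[float, float]:
    """Return (mean - 2 S.D., mean + 2 S.D.), rounded to two decimals."""
    return round(mean - 2 * sd, 2), round(mean + 2 * sd, 2)


# Mean and standard deviation of the posterior sagittal ratio (Table 7).
posterior_sagittal = {"gynecoid": (0.46, 0.04), "anthropoid": (0.55, 0.04)}

for pelvic_type, (mean, sd) in posterior_sagittal.items():
    low, high = two_sd_limits(mean, sd)
    print(f"{pelvic_type}: {low:.2f}-{high:.2f}")
```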

A review of these distributions reveals several interesting points. In
the case of the posterior sagittal ratio there appears to be a definite dis- 
tinction between those classified as anthropoid and as platypelloid. How- 
ever there is a good bit of overlapping between the android and gynecoid 
types. This appears to substantiate the physical basis of the relation 
between these types. When the anterior segment is considered, the situa- 
tion becomes complex since in all of these ratios there is a good deal of 
overlapping between all of the types. It is interesting to speculate as 
to the reason for the apparently greater degree of success associated with 
using these ratios for classifying the posterior segment rather than the
anterior segment. The range of the posterior sagittal ratio is .36. For
the three ratios used in classifying the anterior segment, the ranges are
.18, .24 and .22. As the range for the distribution decreases there is
more overlapping between these types. It is quite logical to assume
that if a ratio has a larger range, one is more able to divide the
distribution into distinct categories than if the range were smaller. The lack
of success in utilizing ratios for classification of the pelvis is probably a
result of the fact that the distribution of these ratios has too narrow a
range to allow division into four distinct categories.
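
The overlapping described above can be checked directly from the limits in Table 7. The sketch below does so for the posterior sagittal ratio; it treats the mean ± 2 S.D. limits (gynecoid, anthropoid) and the observed ranges (android, platypelloid) alike as intervals, which is a simplification made only for illustration.

```python
from itertools import combinations


def overlap(a, b):
    """Length of the overlap of two closed intervals (0.0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


# Posterior sagittal ratio limits taken from Table 7.
limits = {
    "gynecoid": (0.38, 0.54),
    "anthropoid": (0.47, 0.63),
    "android": (0.36, 0.48),
    "platypelloid": (0.30, 0.44),
}

for first, second in combinations(limits, 2):
    shared = overlap(limits[first], limits[second])
    verdict = "overlap" if shared > 0 else "distinct"
    print(f"{first} / {second}: {verdict}")
```

Only the anthropoid and platypelloid intervals are disjoint, in keeping with the single exception noted in the text.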

To summarize, we have not been able to find any consistent relation 
between the Caldwell-Moloy classifications and the measurements which 
we have taken, with the single exception of the anthropoid-platypelloid
types of the posterior segment. If a classification by measurements is to 
be used to replace the Caldwell-Moloy scheme, it is clear that other 
dimensions must be found.


PROGNOSIS


In addition to interpreting the quantitative and qualitative features
of the pelvis, each reader was asked to prognosticate the outcome of labor
from the x-rays. The predictions were classified as follows: (1) dystocia,
(2) questionable dystocia, and (3) no dystocia. The results of both
readings of the two readers are presented in Table 8.


TABLE 8
Comparison of two interpretations of prognosis of labor by readers A and B


                                        SECOND READING
FIRST READING             Dystocia   Questionable   No dystocia   Total
                                       dystocia
Reader A
  Dystocia                    4            4              0           8
  Questionable dystocia       2           20             12          34
  No dystocia                 0            7            162         169
  Total                       6           31            174         211
Reader B
  Dystocia                    2            4              0           6
  Questionable dystocia       5            5             12          22
  No dystocia                 0            7             68          75
  Total                       7           16             80         103


Examination of this table reveals that in so far as the two readings 
were concerned there was considerable overlapping between the dystocia 
and the questionable dystocia group. Likewise, there was overlapping 
between the questionable dystocia and no dystocia groups. At no time 
did any of the readers classify a case in the dystocia group on one read- 
ing and then in the no dystocia group on the second reading. 

In discussing prognosis it is a natural desire to compare the prognosis 
with the actual outcome of labor. Despite the fact that this series was 
considered to be too small to be very significant, we thought it would be 
interesting to do this. Consequently, an attempt was made to obtain 
the hospital records of these cases. Seventy-five per cent of the records 
were obtained. The remaining 25 per cent were either lost or the patients 
had moved out of the city and had been delivered elsewhere. From the 
records obtained, those cases who had cesarean sections performed either 
electively or for non-pelvic indications, such as placenta praevia, were 
eliminated. Out of 211 patients whose x-rays were read by reader A, 
only 124 labor records were obtained for comparison. The series inter- 
preted by B was too small to attempt any correlation. 

In order to compare the outcome of these labors with the prediction, 
two of the authors placed the outcome of each labor into one of the 
three following categories: dystocia, questionable dystocia and no
dystocia. The differentiation between dystocia and questionable dystocia
was somewhat arbitrary. All those cases which had a trial of labor and
subsequent cesarean section were naturally placed into the dystocia group.
The questionable dystocia group included mostly those who had some- 
what prolonged labors with occipito-posterior or transverse arrests. It 
is quite possible that some cases of primary uterine inertia were included 
in this group. Admittedly the definition and classification of difficult 
labors is in need of revision, especially if one desires to evaluate x-ray 
pelvimetry. 

In Table 9 are presented the data correlating the prediction with the
outcome of labor.


TABLE 9
Correlation of first reading of A with outcome of labor


PROGNOSIS ACCORDING                    OUTCOME OF LABOR
TO FIRST READING OF A     Dystocia   Questionable   No dystocia   Total
                                       dystocia

  Dystocia                    3            0              1           4
  Questionable dystocia       5            3              4          12
  No dystocia                 4           11             93         108

  Total                      12           14             98         124


The inaccuracies are very apparent. Out of 12 labors
classified as being dystocic, reader A stated that 4 of these should have 
no dystocia. The dystocic and questionable dystocic labors totalled 26. 
Out of these reader A predicted 15 should have no dystocia. On the 
other hand out of 16 predictions of dystocia and questionable dystocia, 
11 were verified by the course of labor. Likewise out of 108 no dystocia 
predictions only 4 had dystocia and 11 had questionable dystocic labors. 
We do not believe that the accuracy of the predictions is much different 
from that of other published data. Unfortunately, due to differences in 
classification it would be unfair to compare the results.³
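
The counts quoted in this paragraph can be recovered from Table 9 as follows. The variable names are ours; the figures are those of the table.

```python
# Rows: prognosis from A's first reading; columns: outcome of labor.
# Both are ordered dystocia, questionable dystocia, no dystocia.
table_9 = [
    [3, 0, 1],
    [5, 3, 4],
    [4, 11, 93],
]

# Predictions of dystocia or questionable dystocia, and how many of them
# were followed by a dystocic or questionably dystocic labor.
difficult_predicted = sum(table_9[0]) + sum(table_9[1])
difficult_confirmed = sum(table_9[i][j] for i in (0, 1) for j in (0, 1))

# Dystocic or questionably dystocic labors, and how many of them had
# been given a prognosis of no dystocia.
difficult_outcomes = sum(row[0] + row[1] for row in table_9)
difficult_missed = table_9[2][0] + table_9[2][1]

print(f"{difficult_confirmed} of {difficult_predicted} predictions verified")
print(f"{difficult_missed} of {difficult_outcomes} difficult labors not predicted")
```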


³ The classification of labors with which several workers have evaluated their
predictions varies from “easy and difficult labors” (Moir) (6) and “assisted
and unassisted labors” (Nicholson) (2), to the statistical method of weighting
various factors (Allen) (7). There are as many classifications as there are
investigators.


COMMENT


The purpose of this study is to investigate the extent of variation
present in classifying the female pelvis according to the Caldwell-Moloy
classification. We have observed that this classification is associated
with a large degree of both intra-individual and inter-individual varia- 
tion. An attempt was then made to utilize various quantitative relation- 
ships of the measurement data as a basis for classification. We found 
that these ratios were not of sufficient range to allow their division into 
four distinct categories. 

Since it is impossible to generalize the results from a study which 
utilized only the Steele-Javert technique, it would seem imperative to 
investigate other methods of pelvimetry. It might be possible to find 
one in which such variation is almost negligible. If on the other hand 
it is discovered that the other methods are subject to the same error, 
it would seem necessary to conclude that it is virtually impossible to 
classify the pelvis, and as a necessary corollary, it is impossible to predict 
the mechanism of labor by these methods. 

As has been suggested above, this failure to find a measurement classi- 
fication which could serve as a satisfactory replacement for the Caldwell- 
Moloy method of classification should not be taken as proof that no such 
method can be found. It more probably indicates that other dimensions 
would be more relevant. It remains to be seen whether a workable set 
of dimensions can be found which will give an objective validity to the 
Caldwell-Moloy morphological classification. 

Whether the attempt to find such a set of dimensions would be worth 
making seems to us somewhat doubtful. The practising obstetrician is 
principally interested in prognosis of labor, and little interested in mor- 
phological classification as such. To him, at least, it might seem more 
profitable to direct our energies to the study of measurement data in rela- 
tion to the prognosis of labor. In such a study it would also be neces- 
sary to examine and redefine the criteria of “dystocia.” Such a study 
would be of inestimable value to the science of obstetrics. 


SUMMARY AND CONCLUSIONS 


A series of x-ray pelvimetries taken according to the Steele-Javert 
technique were studied in an attempt to determine the consistency of 
classification of pelvic types according to the method of Caldwell-Moloy. 
One reader interpreted approximately 210 pelvimetries on two occasions 
and a second reader read approximately half of these on two separate 
occasions. The various readings were analyzed. We found that there 
exists a large degree of both intra-individual and inter-individual variation
in classifying the pelvis. After the measurements of the various
pelvic diameters were found to be made with only a small degree of error,
we attempted to utilize the relationships between these measurements as a 
basis for classification. We discovered that the ranges of these ratios 
were too narrow to permit their consistent division into four distinct 
categories. 

Several recommendations can be derived from this study: 


(1) Other methods of pelvimetry should be investigated in a similar 
manner. 

(2) If the results of such an investigation are similar to those ob- 
tained in this report, a more exhaustive and detailed study of the rela- 
tionships between measurements and mechanism of labor should be 
carried out. 

(3) Simultaneously with this study an attempt should be made to 
redefine and reclassify the terms and types of “difficult labors.”


STATISTICAL MODELS BEARING ON THE
SEMANTICS OF CORRELATION 


II. THE NON-REPLACEMENT MODEL


BY LANCELOT HOGBEN, F.R.S. AND J. A. H. WATERHOUSE 
Department of Medical Statistics, University of Birmingham 


1. INTRODUCTION 


The previous communication of this series explored the properties
of a model which brings into focus two issues: (a) the necessity 
of drawing a clear-cut distinction between linear concomitant variation 
and linear regression, in particular vis à vis the use of the correlation
ratio as a criterion of linearity; (b) the impropriety of employing the 
correlation ratio as an index of explained variation when the relation- 
ship between the relevant variates is one of concurrence in contra- 
distinction to consequence as defined in the same context. The model 
which is the subject of this communication focuses attention on the 
second issue and points to an alternative approach to the semantic 
credentials of partial correlation. In its simplest form the problem to be 
dealt with is expressible in the following terms: if one player A draws 
a cards simultaneously from a pack of n cards without replacing them 
before a second player B draws b cards simultaneously from the residual 
pack of (n—a) cards, what are the implications of the constraint 
imposed by A’s prior choice on B’s score? 

A comprehensive treatment of the problem calls for definition of 
two ways of scoring, respectively specified as taxonomic and represen- 
tative. By the taxonomic method we here imply scoring the result of 
A’s choice or that of B in terms of the number (or proportion) of cards 
of a given class, e.g. hearts. By the representative method we signify 
scoring the result by a numerical total or mean, as we are free to do 
if we assign to each card of the pack a unit score, e.g. by numbering 
them consecutively. This distinction tallies with the dichotomy more 
customarily and variously referred to as sampling of attributes and 
measurements or quantitative and qualitative statistics. In fact, both
modes of expression in common use are somewhat misleading. If the
end in view is to compare two treatment procedures for anaemia, one 
may score the result taxonomically in our sense of the term by specifying 
what proportion in each group have a red blood cell count above 3 
million per c.mm., or representatively by specifying for each the mean 
red blood cell count of the group as a whole. In either case, the criterion 
of classification is quantitative and in neither case is it, in fact, a 
measurement. 

Needless to say, we are at liberty to regard the taxonomic as a 
limiting case of the alternative method, when the classification is binary, 
since we may then assign to successes and failures respectively unity and 
zero as unit scores; but it will bring into clearer focus issues which the 
non-replacement model serves to clarify, if we consider separately and 
initially the 2-class card pack universe. It is also convenient to do so 
for a reason not sufficiently emphasised in current expositions. The 
specification of a universe as binary does in fact impose on it the 
specific algebraic form of the unit sample distribution, viz. $(p + q)^1$,
as does the method of rank scoring, in the absence of ties, when the 
method of scoring is representative. Otherwise, logic alone does not 
suffice to furnish a unit-sample distribution function for the analytical 
treatment of representative scoring without recourse to empirical pos- 
tulates, e.g. the assumption of an approximate normality. 

Initially, we shall employ the notation of the previous communication.
In particular $M_{ba}$ and $V_{ba}$ respectively stand for the mean and variance
of the distribution of B-scores associated with the column-heading
A-score $x_a = c$. Similarly, $M_{ab}$ and $V_{ab}$ stand for the mean and variance
of the distribution of A-scores associated with the row-border B-score
$x_b = r$. For the variance of the distribution of the B-score and A-score
means we write respectively $V(M_{ba})$ and $V(M_{ab})$. For the mean of the
variances of B-score and A-score distributions we write respectively
$M(V_{ba})$ and $M(V_{ab})$. As before we distinguish by $E$, $E_a$ or $E_b$, and $E_{ba}$
or $E_{ab}$ respectively the operations of extracting the mean value of all
cell entries, of column or row means and of scores within a column or
within a row. In this notation


$$E_a \cdot E_{ba} = E = E_b \cdot E_{ab}.$$


2. NUMERICAL ILLUSTRATIONS OF THE NON-REPLACEMENT MODEL


The following examples will suffice to illustrate the properties of 
the model under consideration. 


(a) Taxonomic scoring. From a 6-card pack consisting of 2 clubs 
and 4 hearts, the first player (A) simultaneously draws 2 cards and 
the second player (B) draws 3 from the residual pack of 4. A’s 
heart score (0, 1 or 2) distribution is given by successive terms of
$(2 + 4)^{(2)}/6^{(2)}$, viz.


A’s score              0    1    2
Frequency (×15)        1    8    6

The residual packs from which B draws are as follows:

A’s heart score        0    1    2
Residual hearts        4    3    2
Residual clubs         0    1    2


If A’s heart score is 0, that of B is necessarily 3 for the 3-fold draw.
If A’s heart score is 1, the distribution of B’s heart score of 0, 1, 2 or 3
accords with the terms of $(1 + 3)^{(3)}/4^{(3)}$. If A’s heart score is 2, the
appropriate binomial is $(2 + 2)^{(3)}/4^{(3)}$. Thus we get a frequency table:


                       B’s heart score
                       0    1    2    3
when $x_a$ = 0           0    0    0    1
     $x_a$ = 2           0    4    4    0


To obtain the correlation grid we have to weight the above by corresponding
frequencies of A’s score, and obtain:

From the above we obtain:


$$V_a = 16/45; \qquad V_b = 2/5;$$
$$k_{ba} = -3/4; \qquad k_{ab} = -2/3;$$
$$\mathrm{Cov}(x_a, x_b) = -4/15 = k_{ba}V_a = k_{ab}V_b;$$
$$r_{ab} = -1/\sqrt{2}.$$


In view of what follows the reader will note that the sampling 
fractions of the players are respectively : 


$$f_a = 2/6 = 1/3; \qquad f_b = 3/6 = 1/2;$$
$$\frac{f_a \cdot f_b}{(1 - f_a)(1 - f_b)} = \frac{1}{2} = r_{ab}^2.$$
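The figures of example (a) can be checked by direct enumeration. The following is a minimal sketch in Python and is not part of the original communication; the names `joint_heart_scores` and `moments` are our own illustrative choices, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction
from math import comb


def joint_heart_scores(hearts, clubs, a, b):
    """Exact joint distribution {(x_a, x_b): probability} of the two heart scores."""
    joint = {}
    for xa in range(a + 1):
        ways_a = comb(hearts, xa) * comb(clubs, a - xa)
        if ways_a == 0:
            continue  # A cannot hold this many hearts
        p_a = Fraction(ways_a, comb(hearts + clubs, a))
        h, c = hearts - xa, clubs - (a - xa)  # residual pack left for B
        for xb in range(b + 1):
            ways_b = comb(h, xb) * comb(c, b - xb)
            if ways_b:
                joint[(xa, xb)] = p_a * Fraction(ways_b, comb(h + c, b))
    return joint


def moments(joint):
    """Return (V_a, V_b, Cov) computed from the joint distribution."""
    def mean(f):
        return sum(p * f(xa, xb) for (xa, xb), p in joint.items())
    m_a, m_b = mean(lambda x, y: x), mean(lambda x, y: y)
    v_a = mean(lambda x, y: x * x) - m_a ** 2
    v_b = mean(lambda x, y: y * y) - m_b ** 2
    cov = mean(lambda x, y: x * y) - m_a * m_b
    return v_a, v_b, cov


# Pack of example (a): 4 hearts, 2 clubs; A draws 2, then B draws 3.
grid = joint_heart_scores(hearts=4, clubs=2, a=2, b=3)
v_a, v_b, cov = moments(grid)
r_squared = cov ** 2 / (v_a * v_b)
f_a, f_b = Fraction(2, 6), Fraction(3, 6)
print(v_a, v_b, cov, r_squared)
print(r_squared == f_a * f_b / ((1 - f_a) * (1 - f_b)))
```

The dictionary `grid` holds the weighted correlation grid as exact probabilities; multiplying its entries by 15 gives whole-number cell frequencies.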


(b) Representative scoring. From a pack of six cards consisting of 
the ace, 2, 3, 4, 5 and 6 of clubs, each player takes 2 cards without 
replacement, recording as his score the total number of pips. Since A 
draws twice, he may select any one of $^6C_2 = 15$ combinations, and B
may choose any one of $^4C_2 = 6$ residual combinations which we may
set out in the following schema: 


A’s choice      Possible choice of B
    12          34   35   36   45   46   56
    13          24   25   26   45   46   56
    14          23   25   26   35   36   56
    15          23   24   26   34   36   46
    16          23   24   25   34   35   45
    23          14   15   16   45   46   56
    24          13   15   16   35   36   56
    25          13   14   16   34   36   46
    26          13   14   15   34   35   45
    34          12   15   16   25   26   56
    35          12   14   16   24   26   46
    36          12   14   15   24   25   45
    45          12   13   16   23   26   36
    46          12   13   15   23   25   35
    56          12   13   14   23   24   34


The corresponding scores of the samples set out in the foregoing schema
are as follows:

A’s choice          Possible B-scores
Sample   Score

  12       3        7    8    9    9   10   11
  13       4        6    7    8    9   10   11
  14       5        5    7    8    8    9   11
  15       6        5    6    8    7    9   10
  16       7        5    6    7    7    8    9
  23       5        5    6    7    9   10   11
  24       6        4    6    7    8    9   11
  25       7        4    5    7    7    9   10
  26       8        4    5    6    7    8    9
  34       7        3    6    7    7    8   11
  35       8        3    5    7    6    8   10
  36       9        3    5    6    6    7    9
  45       9        3    4    7    5    8    9
  46      10        3    4    6    5    7    8
  56      11        3    4    5    5    6    7


Each of the 2-fold samples either A or B can draw admits of 2 
permutations. So the number of permutations corresponding to particu- 
lar scores in this lay-out are in the same ratio as the number of com- 
binations. Consequently, the required frequencies of particular A-scores 
associated with particular B-scores are as exhibited above; and we may 
summarise the result as a correlation table (see Table 1) in which the 
drift of figures is downwards from the top right-hand corner to the left 
lower corner. 


TABLE 1
Correlation table for example (b)

                               A-score
B-score      3      4      5      6      7      8      9     10     11   TOTAL   MEAN   VARIANCE
    3        -      -      -      -      1      1      2      1      1      6     9.0     5/3
    4        -      -      -      1      1      1      1      1      1      6     8.5    35/12
    5        -      -      2      1      2      2      2      1      2     12     8.0     4
    6        -      1      1      2      2      2      2      1      1     12     7.5    47/12
    7        1      1      2      2      6      2      2      1      1     18     7.0    35/9
    8        1      1      2      2      2      2      1      1      -     12     6.5    47/12
    9        2      1      2      2      2      1      2      -      -     12     6.0     4
   10        1      1      1      1      1      1      -      -      -      6     5.5    35/12
   11        1      1      2      1      1      -      -      -      -      6     5.0     5/3
TOTAL        6      6     12     12     18     12     12      6      6     90     7      14/3
MEAN       9.0    8.5    8.0    7.5    7.0    6.5    6.0    5.5    5.0      7
VARIANCE   5/3  35/12      4  47/12   35/9  47/12      4  35/12    5/3

From the data contained in the correlation table we obtain:
$$\mathrm{Cov}(x_a, x_b) = -7/3; \qquad V_a = 14/3 = V_b;$$
$$r_{ab} = -1/2.$$

For the inter-class and intra-class variances we have:

$$V(M_{ab}) = 7/6 = V(M_{ba});$$
$$M(V_{ab}) = 7/2 = M(V_{ba});$$
$$V(M_{ab}) + M(V_{ab}) = 14/3 = V_a;$$
$$V(M_{ba}) + M(V_{ba}) = 14/3 = V_b;$$
$$V(M_{ba})/V_b = V(M_{ab})/V_a = 1/4 = r_{ab}^2.$$


The sampling fractions are $f_a = 1/3 = f_b$,

$$\frac{f_a \cdot f_b}{(1 - f_a)(1 - f_b)} = \frac{1}{4} = r_{ab}^2.$$
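The correlation table of example (b) and the breakdown of B's variance can be rebuilt by listing all 90 equally likely pairs of hands. What follows is an illustrative Python sketch under our own naming (`score_pairs`, `partition_b`, `mean`, `variance` are assumptions of the sketch, not notation of the paper).

```python
from fractions import Fraction
from itertools import combinations


def score_pairs(pack, a, b):
    """List every equally likely (s_a, s_b) pair for draws without replacement."""
    cards = range(len(pack))
    pairs = []
    for hand_a in combinations(cards, a):
        rest = [i for i in cards if i not in hand_a]
        for hand_b in combinations(rest, b):
            pairs.append((sum(pack[i] for i in hand_a),
                          sum(pack[i] for i in hand_b)))
    return pairs


def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))


def variance(values):
    values = list(values)
    return mean(v * v for v in values) - mean(values) ** 2


def partition_b(pairs):
    """Return (V_b, V(M_ba), M(V_ba)): B's variance and its column breakdown."""
    columns = {}
    for s_a, s_b in pairs:
        columns.setdefault(s_a, []).append(s_b)   # one column per A-score
    grand = mean(s_b for _, s_b in pairs)
    weight = lambda col: Fraction(len(col), len(pairs))
    v_of_m = sum(weight(c) * (mean(c) - grand) ** 2 for c in columns.values())
    m_of_v = sum(weight(c) * variance(c) for c in columns.values())
    return variance(s_b for _, s_b in pairs), v_of_m, m_of_v


pairs = score_pairs([1, 2, 3, 4, 5, 6], a=2, b=2)
v_b, v_of_m, m_of_v = partition_b(pairs)
print(len(pairs), v_b, v_of_m, m_of_v, v_of_m / v_b)
```

Counting the pairs by score, for instance with `collections.Counter(pairs)`, gives the cell entries of Table 1.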


3. THE 2-CLASS UNIVERSE: TWO PLAYERS ONLY


The simplest variant (Fig. 1) of the non-replacement model is the 
case when player A records as his score the number of hearts he selects 
and B does likewise. We shall use $p = (1 - q)$ for the proportion of
hearts in the whole pack, and $p_b = (1 - q_b)$ for the proportion of hearts
in the residual pack after A has taken $x_a$ hearts from it. By definition
therefore:

$$p_b = \frac{np - x_a}{n - a}. \qquad (3.1)$$

In conformity with the elementary properties of the non-replacement
distribution:

$$M_a = ap = E_a(x_a); \qquad M_{ba} = b\,p_b; \qquad (3.2)$$

$$V_a = \frac{a(n-a)\,pq}{n-1}; \qquad V_{ba} = \frac{b(n-a-b)\,p_b q_b}{n-a-1}. \qquad (3.3)$$

Whence we derive:

$$V_{ba} = \frac{b(n-a-b)}{(n-a-1)} \cdot \frac{(np - x_a)}{(n-a)} \cdot \frac{(n-a-np+x_a)}{(n-a)} = \frac{b(n-a-b)(np - x_a)(nq - a + x_a)}{(n-a)^2(n-a-1)}. \qquad (3.4)$$


Fig. 1. NON-REPLACEMENT MODEL
The two players respectively draw 3 (A) and 2 (B) cards from a full pack,
recording their heart scores. The common denominator of the entries in the
frequency grid on the right is $52^{(5)}$.
(Panels: Score Products and Frequencies; Frequency Grid.)

That linearity of regression is a necessary property of the model
follows simply from (3.2) by recourse to the notation employed in the
previous communication of this series. We first note that:

$$M_b = E_a(M_{ba}) = \frac{bnp}{n-a} - \frac{b}{n-a}E_a(x_a) = \frac{bnp}{n-a} - \frac{bM_a}{n-a}. \qquad (3.5)$$

By substitution in (3.2):


$$M_{ba} - M_b = b(M_a - x_a)/(n - a). \qquad (3.6)$$

For brevity we shall write $k_{ba} = b/(n - a)$, so that:

$$M_{ba} - M_b = k_{ba}(M_a - x_a). \qquad (3.7)$$


In the same way, we arrive at the conclusion that regression of the 
score of A on that of B is also linear. It is important to notice that 
the mean score of B is the same whether A does or does not draw. 
From (3.5) we have:

$$M_b = \frac{bnp}{n-a} - \frac{abp}{n-a} = bp. \qquad (3.8)$$

We are now able to establish the covariance of the two players’ scores,
viz.,

$$\mathrm{Cov}(x_a, x_b) = E(x_a \cdot x_b) - M_a M_b. \qquad (3.9)$$


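From the structure of any correlation grid:

$$E(x_a \cdot x_b) = E_a(x_a M_{ba}),$$

whence by (3.7):

$$E(x_a \cdot x_b) = E_a(x_a M_b + k_{ba} M_a x_a - k_{ba} x_a^2)$$
$$= M_b E_a(x_a) + k_{ba} M_a E_a(x_a) - k_{ba} E_a(x_a^2)$$
$$= M_a M_b + k_{ba} M_a^2 - k_{ba}(V_a + M_a^2).$$

From (3.9):

$$\mathrm{Cov}(x_a, x_b) = -k_{ba}\left[(V_a + M_a^2) - M_a^2\right],$$

$$\mathrm{Cov}(x_a, x_b) = -k_{ba}V_a = -ab \cdot pq/(n-1), \qquad (3.10)$$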


whence we obtain: 
$$r_{ab} = -k_{ba}\sqrt{V_a/V_b}. \qquad (3.11)$$


To evaluate (3.11) we have to determine $V_b$, the variance of the
B-score distribution. It is convenient to make use of the variance of
the distribution of the total score of A and B as follows:

$$V(x_a + x_b) = V_a + V_b + 2\,\mathrm{Cov}(x_a, x_b);$$

hence from (3.10):

$$V(x_a + x_b) = (1 - 2k_{ba})V_a + V_b. \qquad (3.12)$$


The total score of player A and player B is, of course, what A’s score
would be, if he drew $(a + b)$ instead of $a$ cards. Hence by (3.3):

$$V(x_a + x_b) = \frac{(a+b)(n-a-b)\,pq}{n-1} = \frac{(a+b)(n-a-b)}{a(n-a)}V_a.$$

Whence by substitution in (3.12) in accordance with the meaning we
attach to $k_{ba}$:

$$V_b = \left[\frac{(a+b)(n-a-b)}{a(n-a)} - 1 + \frac{2b}{n-a}\right]V_a,$$
$$V_b = \frac{b(n-b)}{a(n-a)}V_a. \qquad (3.13)$$

Finally from (3.3):

$$V_b = \frac{b(n-b)\,pq}{n-1}. \qquad (3.14)$$


The last equation signifies that the variance of the score distribution 
of player B is exactly the same as if A had not drawn any cards pre- 
viously. We may now evaluate (3.11): 


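$$r_{ab} = -\frac{b}{n-a}\sqrt{\frac{a(n-a)}{b(n-b)}},$$
$$r_{ab}^2 = \frac{b^2}{(n-a)^2} \cdot \frac{a(n-a)}{b(n-b)} = \frac{ab}{(n-a)(n-b)}. \qquad (3.15)$$

We may denote as follows the sampling fractions:

$$f_a = a/n; \qquad f_b = b/n.$$

Thus we have:

$$r_{ab}^2 = \frac{f_a \cdot f_b}{(1 - f_a)(1 - f_b)}. \qquad (3.16)$$

When choice is exhaustive in the sense that B takes all the residual
$(n - a)$ cards after A has chosen, $f_b = (1 - f_a)$; and hence $r_{ab} = -1$.
When the universe is indefinitely large, so that any sampling fraction
is approximately zero, $(1 - f_a) \simeq 1 \simeq (1 - f_b)$ and $r_{ab} \simeq 0$, a result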
which is consistent with the assumption that any constraint imposed 
by non-replacement is relevant only in the domain of sampling from 
the finite universe. 
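The two limiting cases just mentioned are easily seen numerically. The short Python sketch below is our own illustration (the function name `r_ab` and the chosen pack sizes are assumptions); it evaluates the correlation from the sampling fractions alone, in floating point.

```python
from math import isclose, sqrt


def r_ab(n, a, b):
    """Correlation of the two heart scores from the sampling fractions alone."""
    if a <= 0 or b <= 0 or a + b > n:
        raise ValueError("need a > 0, b > 0 and a + b <= n")
    f_a, f_b = a / n, b / n
    return -sqrt(f_a * f_b / ((1 - f_a) * (1 - f_b)))


print(r_ab(6, 2, 3))          # the pack of example (a) in section 2
print(r_ab(52, 20, 32))       # exhaustive choice: B takes all that A leaves
print(r_ab(10**9, 20, 32))    # very large universe: correlation near zero
```

Note that the proportion of hearts does not enter the function at all, in keeping with the form of the result.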

In virtue of linear regression as shown in § 2 of Part I of this series:

$$\frac{V(M_{ba})}{V_b} = r_{ab}^2 = \frac{ab}{(n-a)(n-b)},$$
$$V(M_{ba}) = k_{ba}^2 V_a = \frac{ab^2\,pq}{(n-a)(n-1)}. \qquad (3.17)$$


4. PARTITION OF VARIANCE IN THE 2-CLASS UNIVERSE 


We have now seen that regression is necessarily linear in both 
dimensions when we correlate two players’ taxonomic scores in the 
universe of sampling with no replacement. Accordingly, we might 
expect that the square of the product-moment index would be, as it is 
in fact, equal to the ratio of the variance of the column means (B-scores 
for a particular value of $x_a$) to the variance of the B-score distribution
as a whole. On the other hand, we have already disclosed good enough 
reason for not regarding this ratio as a measure of explanation or so- 
called coefficient of determination. In any meaningful sense of the 
epithet, the possibility of a partition of the variance of the second player’s 
score into components respectively explained and not explained by the 
antecedent choice of A signifies that the variance of B’s actual score 
distribution would in fact be less, if player A did not choose any cards. 
In fact, we have already seen that the antecedent choice of player A 
does not alter the variance of B’s score distribution, since (3. 14) defines 
what the latter would be, if B drew the b cards initially. The result
exhibited in (3.14) is sufficiently paradoxical to merit re-examination 
vis à vis the customary breakdown of variance in accordance with the
universal property of any grid, viz., 


$$V_b = M(V_{ba}) + V(M_{ba}). \qquad (4.1)$$

Since $M(V_{ba}) = E_a(V_{ba})$, we can write in accordance with (3.4):

$$M(V_{ba}) = \frac{b(n-a-b)}{(n-a)^2(n-a-1)}\left[np(n-a-np) + (2np - n + a)E_a(x_a) - E_a(x_a^2)\right].$$

But $E_a(x_a^2) = V_a + a^2p^2$ and $E_a(x_a) = ap$, so that we have, after
reduction:

$$M(V_{ba}) = \frac{bn(n-a-b)}{(n-a)(n-1)}\,pq. \qquad (4.2)$$

From (3.17):

$$V_b - V(M_{ba}) = \left[1 - \frac{ab}{(n-a)(n-b)}\right]V_b = \frac{n(n-a-b)V_b}{(n-a)(n-b)}. \qquad (4.3)$$

By substituting (4.2) and (4.3) in (4.1), we thus get:

$$\frac{n(n-a-b)V_b}{(n-a)(n-b)} = \frac{bn(n-a-b)}{(n-a)(n-1)}\,pq,$$
$$V_b = \frac{b(n-b)}{n-1}\,pq. \qquad (4.4)$$


In conformity with the customary equation for the analysis of 
variance, we thus arrive by a different route at the conclusion stated 
above, namely that the variance of B’s score distribution is exactly the 
same whether player A does or does not draw first. We may express 
this result in the following way: if our aim is to partition the variation 
of B’s score in such a way as to exhibit a component fraction attribut- 
able to the antecedent choice of A, variance is not a suitable measure 
of variation. In any case, we can attach no meaning consonant with 
its use as a coefficient of determination or measure of so-called explained 
variation to the proportionate contribution of the variance of the 
column means (i.e., mean B-score associated with particular values of
A’s score). 
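The point may be made concrete by letting A draw 0, 1, 2 or 3 cards from the 6-card pack of § 2 before B draws 3. The Python sketch below is an illustration of ours, not the authors' procedure; `hyper` and `split_variance` are hypothetical helper names. It computes the two components of (4.1) exactly for each case.

```python
from fractions import Fraction
from math import comb


def hyper(N, K, m):
    """Hypergeometric law {x: P(x hearts)}: m cards drawn from N holding K hearts."""
    lo, hi = max(0, m - (N - K)), min(K, m)
    return {x: Fraction(comb(K, x) * comb(N - K, m - x), comb(N, m))
            for x in range(lo, hi + 1)}


def split_variance(n, hearts, a, b):
    """Return (M(V_ba), V(M_ba)) for B's heart score when A first draws a cards."""
    m_b = Fraction(b * hearts, n)                 # B's grand mean, bp
    mean_of_var = var_of_mean = Fraction(0)
    for x_a, p_a in hyper(n, hearts, a).items():
        column = hyper(n - a, hearts - x_a, b)    # B's law within this column
        m = sum(x * p for x, p in column.items())
        v = sum(x * x * p for x, p in column.items()) - m * m
        mean_of_var += p_a * v
        var_of_mean += p_a * (m - m_b) ** 2
    return mean_of_var, var_of_mean


for a in range(4):                                # 4 hearts, 2 clubs; b = 3
    m_v, v_m = split_variance(6, 4, a, 3)
    print(a, m_v, v_m, m_v + v_m)
```

The share carried by the column means changes with a, rising from zero when A abstains to the whole when A and B exhaust the pack, while the total in the last column stays fixed.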

The admitted artificiality of the model under discussion does not 
detract from the logical interest of the conclusion last stated. Except 
as a caveat against the well-recognised pitfalls of successive sampling 
at random from a repository of records, it may be difficult to formulate 
of real situations, with respect to which one would customarily employ 
the technique of correlation or regression, a law of constraint exactly 
comparable to the one which is implicit in the foregoing treatment. 
None the less, our model challenges attention, if only for the reason 
that we do commonly employ the technique of correlation in situations 
about which our knowledge of the causal nexus is at least nebulous. 
From that point of view, it emphasises forcefully a conclusion stated 
in the preceding communication of this series. 

The umpire bonus model of the foregoing communication disclosed
a class of situations in which: (a) regression is linear within the
domain of consequence, as there defined, and the customary partition of 
variance is applicable in the sense that the square of the product- 
moment index is a just measure of explanation; (b) regression is 
usually non-linear in the domain of concurrence and a meaningful 
partition of variance is in any case inconsistent with the law of con- 
comitant variation. At first sight, we might be tempted to regard the 
score of player B as consequential with respect to that of player A in 
the same sense that the score of player A or of player B is consequential 
with respect to that of the Umpire of the Model of Part I of this series. 
We have to thank Doctor C. P. Winsor for encouraging us to enlarge 
the semantic framework of situations in which correlations may arise 
by distinguishing as constrained a relationship which is in no sense 
concurrent and is only superficially consequential. For a consequential 
relationship as heretofore defined is essentially asymmetrical with respect 
to order of choice, but non-replacement imposes no asymmetry on the 
mean outcome of either choice. When this is true, as also for the class 
of models illustrated in § 8 below, no meaningful partition of variance 
is admissible. 


5. REPRESENTATIVE SCORING: ELEMENTARY RESULTS


We shall now explore the assumption that the universe consists of 
more than 2 classes each specified by a numerical score, as when we 
number consecutively the cards of a pack. In that case, of course, 
we are sampling from a rectangular universe; but we shall proceed, 
as in the previous communication of this series, without introducing 
any particular postulates concerning the unit-sample distribution function.
As a representative score it is more convenient to employ the
score-sum than the mean, e.g., if A draws a 3-fold sample with 2, 5
and 7 pips, his score is 2 + 5 + 7 = 14. In conformity with our
prescription of the law of the model, A draws an a-fold sample
simultaneously from the n-fold pack and B draws without replacement
a b-fold sample from the (n - a)-fold residual pack. We shall denote
the score of A and B respectively defined as above by $s_a$ and $s_b$. For
the sum of all the card pack scores, i.e. the score corresponding to an
n-fold sample, we shall use $s_n$. Where necessary to specify the maximum
value which a unit (single-card) score (r) may attain, we shall denote
it by r = m. If we place no restriction on the unit-sample distribution
of the pack, it will be necessary to specify the number ($C_r$) of cards
therein with unit-score r. As elsewhere we shall employ $M_a$ and $M_b$
respectively for the mean values of the scores (i.e. score-sums) of players
A and B, denoting by $M_{ba}$ the mean value of the score-sum of player B
associated with a particular score $c = s_a$ of player A. It will also be
convenient to use $M_1$ and $V_1$ for the mean and variance of the unit
sample; that is,

$$M_1 = s_n/n,$$

$$V_1 = \sum_r C_r (r - M_1)^2/n.$$


We noted in § 3 that, in the taxonomic case, $M_b$ and $V_b$, the mean
and variance of B’s score, are precisely what they would be had B drawn
first. This point is worth a somewhat more detailed discussion. It
will be as easy to consider the situation for representative as for
taxonomic scoring; we remind the reader that the taxonomic score is a
special case of the representative score, so that any result established
for the latter will hold equally for the taxonomic score.
We shall establish the following:


Theorem. Let A draw a cards without replacement, and let B there- 
after draw b cards without replacement. The probability that B shall 
draw a specified set of b cards, averaged over all of A’s draws, is inde- 
pendent of the number of cards drawn by A. 


Proof. The total number of permutations of n cards is n! The
number of permutations in which b specified cards occupy b assigned
places is (n - b)! Hence the probability that B draws b specified cards
in an assigned order is (n - b)!/n! There are b! permutations of the
b specified cards, whence the probability that B draws these cards in
some order is b!(n - b)!/n! But this probability is independent of a,
the number of cards drawn by A, which is the result we were to prove.

It will be noted that the theorem is stronger than the result in § 3, 
in that it goes beyond the mean and variance to establish that the prob- 
ability distribution of B’s possible draws is independent of the number 
of cards drawn by A. 

We may note as a special case that the probability of a particular 
score at a single draw is independent of which draw is in question. 
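The theorem lends itself to a brute-force check on a small pack. The sketch below, in Python, is an illustration of ours under the assumption that a deal is modelled as a random permutation of which A takes the first a places and B the next b; the name `prob_b_draws` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def prob_b_draws(target, n, a):
    """P(B's hand is exactly `target`) when A takes the first a cards of the deal."""
    target = frozenset(target)
    b = len(target)
    hits = sum(1 for deal in permutations(range(n))
               if frozenset(deal[a:a + b]) == target)
    return Fraction(hits, factorial(n))


n, target = 6, {0, 3}                      # six cards; B is to hold cards 0 and 3
for a in range(n - len(target) + 1):       # A draws 0, 1, ..., 4 cards first
    print(a, prob_b_draws(target, n, a))
```

Every line of output shows the same probability, b!(n - b)!/n!, whatever the number of cards A removes beforehand.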

We shall now proceed to evaluate $M_a$ and $M_{ba}$ in conformity with
the condition of non-replacement. For $M_a$ we have

$$M_a = E\left(\sum a_i\right) = aM_1 = as_n/n, \qquad (5.1)$$

since the expected value of the sum of a draws is the sum of the
expectations at each draw, and these, by our theorem above, are all
equal. For $M_{ba}$ we have at once

$$M_{ba} = b(s_n - s_a)/(n - a), \qquad (5.2)$$

since B is drawing from a pack with $n - a$ cards of which the score
sum is $s_n - s_a$.

For B’s mean score, regardless of the antecedent choice of A, we have,
in view of our theorem,

$$M_b = bs_n/n. \qquad (5.3)$$

It is instructive to note that we can also derive this from the relation

$$M_b = E_a(M_{ba}),$$

for we have

$$E_a(M_{ba}) = \frac{b}{n-a}\left[s_n - E_a(s_a)\right] = \frac{b}{n-a}(s_n - M_a),$$

and substituting the value of $M_a$, we have easily

$$E_a(M_{ba}) = bs_n/n.$$


6. NON-REPLACEMENT WITH REPRESENTATIVE SCORING 


In virtue of (5.1) to (5.3), we may now establish the theorem 
that regression is linear in both dimensions without restriction on the 
nature of the unit-sample distribution, and therefore that it is equally 
true with respect to non-replacement choice from a rectangular or 
normal universe. By (5.2) and (5.3): 


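$$M_{ba} - M_b = \frac{b}{n-a}(s_n - s_a) - \frac{bs_n}{n} = \frac{b}{n-a}\left(\frac{as_n}{n} - s_a\right).$$

Whence from (5.1):

$$M_{ba} - M_b = b(M_a - s_a)/(n - a). \qquad (6.1)$$

If we write $k_{ba} = b/(n - a)$:

$$M_{ba} - M_b = k_{ba}(M_a - s_a). \qquad (6.2)$$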


That regression is also linear in the alternative dimension is too elementary
to merit formal demonstration. In virtue of (6.2) we can at
once employ two relations which necessarily hold good if regression is
linear. In § 2 of Part I we have already exhibited these as tautologies
of a grid:

$$V(M_{ba}) = k_{ba}^2 V_a, \qquad (6.3)$$

$$\mathrm{Cov}(s_a, s_b) = -k_{ba}V_a. \qquad (6.4)$$

The form of the product-moment formula is now deducible:

$$r_{ab} = -k_{ba}\sqrt{V_a/V_b}. \qquad (6.5)$$


Likewise, in virtue of linear regression, we may make use of the grid 
property exhibited as such in § 2 of Part I: 


$$r_{ab}^2 = V(M_{ba})/V_b = \eta_{ba}^2. \qquad (6.6)$$

To obtain the variance of A’s score, we note that

$$s_a - M_a = \sum_i (a_i - M_1),$$

which is simply an expression of the fact that A’s score is the sum of
his separate draws. We now have

$$V_a = E\left[\sum_i (a_i - M_1)\right]^2 = E\left[\sum_i (a_i - M_1)^2\right] + E\left[\sum_{i \neq j} (a_i - M_1)(a_j - M_1)\right],$$

which may be written

$$V_a = \sum_i V(a_i) + \sum_{i \neq j} \mathrm{Cov}(a_i, a_j).$$

Now

$$V(a_i) = V_1$$

and

$$\mathrm{Cov}(a_i, a_j) = -V_1/(n - 1),$$

the last expression coming from (6.4) when each player draws one
card. Further, there are $a$ terms in the first summation, and $a(a - 1)$
terms in the second. Hence we have

$$V_a = aV_1 - a(a-1)V_1/(n-1)$$

or finally

$$V_a = a(n-a)V_1/(n-1). \qquad (6.7)$$

For $V_b$ we have at once

$$V_b = b(n-b)V_1/(n-1). \qquad (6.8)$$

We note that this might also be derived from the variance of the score
of a draw of $a + b$ cards, together with the relation

$$V_{a+b} = V_a + V_b + 2\,\mathrm{Cov}(s_a, s_b).$$

To obtain $M(V_{ba})$ we observe that (6.3) and (6.7) give us

$$V(M_{ba}) = k_{ba}^2 V_a = \frac{ab^2 V_1}{(n-a)(n-1)},$$

whence, using

$$V_b = V(M_{ba}) + M(V_{ba}),$$

we have, with (6.8), and a little reduction,

$$M(V_{ba}) = \frac{nb(n-a-b)V_1}{(n-a)(n-1)}. \qquad (6.9)$$


Since the variance of the score-sum distribution of player B is in 
fact exactly the same whether A does or does not draw antecedently, 
the semantic implications of the conclusions of § 3 above apply equally 
to representative or taxonomical scoring with respect to the non-replacement
model. With the substitution of score-sums ($s_a$ and $s_b$) for the
scores of successes ($x_a$ and $x_b$), and $V_1$ (the variance of the unit sampling
distribution) for $pq$ ($= rpq$ when $r = 1$), we have indeed established
in the representative domain all the properties of the non-replacement
model established in § 2 and § 3 for taxonomic scoring. The latter is
evidently a special case of the former, if n = 2 and the classes are
respectively assigned unity (success) and zero (failure) as their definitive 
scores. 
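The closed forms of this section can be evaluated directly for any pack of unit scores. The Python sketch below is an illustration of ours; the function name `closed_forms` and the dictionary keys are assumptions. It evaluates (6.3), (6.4) and (6.7) to (6.9) for the two packs of § 2, the taxonomic pack being coded with unity for a heart and zero for a club.

```python
from fractions import Fraction


def closed_forms(pack, a, b):
    """Evaluate (6.3), (6.4) and (6.7) to (6.9) for a pack of unit scores."""
    n = len(pack)
    m_1 = Fraction(sum(pack), n)                   # unit-sample mean M_1
    v_1 = sum((r - m_1) ** 2 for r in pack) / n    # unit-sample variance V_1
    k_ba = Fraction(b, n - a)
    v_a = a * (n - a) * v_1 / (n - 1)              # (6.7)
    v_b = b * (n - b) * v_1 / (n - 1)              # (6.8)
    return {
        "V_a": v_a,
        "V_b": v_b,
        "Cov": -k_ba * v_a,                        # (6.4)
        "V(M_ba)": k_ba ** 2 * v_a,                # (6.3)
        "M(V_ba)": n * b * (n - a - b) * v_1 / ((n - a) * (n - 1)),  # (6.9)
    }


print(closed_forms([1, 2, 3, 4, 5, 6], a=2, b=2))   # example (b)
print(closed_forms([1, 1, 1, 1, 0, 0], a=2, b=3))   # example (a), hearts scored 1
```

For the taxonomic pack the unit variance reduces to pq, so that the same function reproduces the results of § 3.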


7. PARTIAL CORRELATION AND SUCCESSIVE NON-REPLACEMENT SAMPLING 
We shall now explore situations involving selection without replace- 
ment by three players: 
U draws first $u$ cards from a pack of $n$ cards;
V draws next $v$ cards from the residual pack of $(n - u)$ cards;
W draws next $w$ cards from the residual pack of $(n - u - v)$ cards.
In this situation $r_{vw}$, $r_{uv}$, $r_{uw}$ have the meaning appropriate to the
conditions stated, and $r_{vw.u}$ signifies the correlation between scores of
players V and W when player U always draws the same u-fold set of


cards. We shall consider three types of relation between the scores of 
the players within the framework of taxonomic scoring:

Case (i) The score of each player is his raw score, so that the
relation of the score of player W to that of either V or
U is one of consequence.

Case (ii) The score recorded by player V or W is the sum of his
individual score ($x_{v.u}$ and $x_{w.u}$) and the score ($x_u$) of
player U. The relation of the W score to that of V is
then purely consequential when u = 0, but is otherwise
partly concurrent.


Case (iii) The score recorded by player V or W is as for case (ii) 
the sum of his individual score and that of player U; 
but the latter replaces the u cards he draws, so that V 
draws from a pack of n and W from a residual pack of 
(n—v) cards. The essential difference between this 
situation and the foregoing is that there is no conse- 
quential relation between the individual scores of V or 


W and that of U. 
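Case (i). We have from § 3 and our theorem of § 5:

$$M_u = up, \quad M_v = vp, \quad M_w = wp; \qquad (7.1)$$
$$V_u = u(n-u)\,pq/(n-1), \text{ etc.}; \qquad (7.2)$$
$$\mathrm{Cov}(x_u, x_v) = -uv\,pq/(n-1), \text{ etc.}; \qquad (7.3)$$
$$r_{uv} = -\sqrt{\frac{uv}{(n-u)(n-v)}}, \text{ etc.} \qquad (7.4)$$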


For the regression of W’s score on the scores of U and V we need
$M_{w(uv)}$, the mean value of the W-score associated with a particular value
($x_u$) of the U-score and a particular value ($x_v$) of the V-score. Since
there are now $(np - x_u - x_v)$ favorable cards out of a total of
$(n - u - v)$, we have at once

$$M_{w(uv)} = w(np - x_u - x_v)/(n - u - v). \qquad (7.5)$$

If now we write

$$k_{w(uv)} = w/(n - u - v),$$

we easily get the regression equation

$$M_{w(uv)} - M_w = k_{w(uv)}(M_u - x_u) + k_{w(uv)}(M_v - x_v). \qquad (7.6)$$
It is worth remarking that we could also write regression equations of
the same form for the U-score and for the V-score. We note again the
symmetry of the situation. 

To determine the partial correlation $r_{vw.u}$, we note that, as indicated
in (7.4), the correlation between the scores of V and W is the same 
whether U chooses first or refrains from choosing. In other words the 
total and partial correlations of the V and W scores are implications of 
the classical formula for partial correlation, to eliminate the effect of a 
prior choice by player U. We have to evaluate 


$$r_{vw.u} = (r_{vw} - r_{vu}r_{wu}) / \sqrt{(1 - r_{vu}^2)(1 - r_{wu}^2)},$$

which gives, using the values from (7.4),

$$r_{vw.u} = -\sqrt{\frac{vw}{(n-u-v)(n-u-w)}}. \qquad (7.7)$$


This is identical with (7.4) when u = 0. In one sense the classical
formula is therefore consistent with the conditions imposed, inasmuch 
as it exhibits the effect of eliminating the prior choice of player U; but 
this is not coextensive with eliminating the effect of variation with 
respect to his choice. The customary interpretation of the partial 
coefficient $r_{vw.u}$ is consistent with a situation in which player U always
takes the same set of u (> 0) cards from the pack as initially con- 
stituted. In that sense, (7.7) is not consistent with (7.4) above; and 
the classical formula does not hold good. Here we encounter an essen- 
tial semantic difference between the interpretation of a partial correlation 
coefficient in the domain of non-replacement and in the alternative 
domain of the umpire bonus model of the preceding communication of 
this series. If there is replacement, the contributory effect of elimi- 
nating the consequences of an antecedent choice is necessarily the same 
as the effect of eliminating the act of choice itself. 
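A numerical check of Case (i) may be helpful. The Python sketch below is our own illustration (`r_raw` and `classical_partial` are hypothetical names, and the values n = 52, u = 5, v = 4, w = 3 are arbitrary). It feeds the total correlations of (7.4) into the classical formula and compares the outcome with (7.7).

```python
from math import isclose, sqrt


def r_raw(n, s, t):
    """Eq. (7.4): correlation of two raw heart scores for draws of s and t cards."""
    return -sqrt(s * t / ((n - s) * (n - t)))


def classical_partial(r_vw, r_vu, r_wu):
    """Classical first-order partial correlation of V and W with respect to U."""
    return (r_vw - r_vu * r_wu) / sqrt((1 - r_vu ** 2) * (1 - r_wu ** 2))


n, u, v, w = 52, 5, 4, 3
partial = classical_partial(r_raw(n, v, w), r_raw(n, u, v), r_raw(n, u, w))
eq_7_7 = -sqrt(v * w / ((n - u - v) * (n - u - w)))
print(partial, eq_7_7)                 # the two agree
print(r_raw(n, v, w))                  # the total correlation (7.4) differs
print(r_raw(n - u, v, w))              # (7.4) evaluated for a pack of n - u cards
```

In this sketch the partial coefficient coincides with (7.4) evaluated for a pack of n - u cards, and not with the total correlation for the pack of n cards.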


Case (ii). As in the previous communication of the series, we here
use $M_{v.u}$ and $V_{v.u}$ respectively to signify the grand mean and variance
of the distribution of the individual score of player V. The meaning
of $M_{w.u}$ and $V_{w.u}$ is referable in the same way to the individual W-score.
The scores recorded by players V and W are the sums of their respective
individual scores ($x_{v.u}$ and $x_{w.u}$) and the score ($x_u$) of the player U, i.e.,

$$x_v = x_{v.u} + x_u \quad \text{and} \quad x_w = x_{w.u} + x_u;$$
$$M_v = M(x_v) = M_u + M_{v.u}, \qquad M_w = M(x_w) = M_u + M_{w.u}.$$

The variance of V’s score is evidently that of a player who draws $u + v$
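cards, or

$$V_v = (u+v)(n-u-v)\,pq/(n-1), \qquad (7.8)$$

and similarly

$$V_w = (u+w)(n-u-w)\,pq/(n-1). \qquad (7.9)$$

For the covariance of V’s and W’s scores we have

$$\mathrm{Cov}(x_v, x_w) = \mathrm{Cov}(x_u + x_{v.u},\; x_u + x_{w.u})$$
$$= V_u + \mathrm{Cov}(x_u, x_{v.u}) + \mathrm{Cov}(x_u, x_{w.u}) + \mathrm{Cov}(x_{v.u}, x_{w.u});$$

which reduces to

$$\mathrm{Cov}(x_v, x_w) = [nu - (u+v)(u+w)]\,pq/(n-1). \qquad (7.10)$$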
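For $r_{vw}$ we then have

$$r_{vw} = \frac{nu - (u+v)(u+w)}{\sqrt{(u+v)(u+w)(n-u-v)(n-u-w)}}. \qquad (7.11)$$

Before we can explore the validity of the partial correlation formula to
obtain $r_{vw.u}$, we need to find $r_{uv}$ and $r_{uw}$. We get, by similar algebra,

$$r_{uv} = \sqrt{\frac{u(n-u-v)}{(n-u)(u+v)}} \quad \text{and} \quad r_{uw} = \sqrt{\frac{u(n-u-w)}{(n-u)(u+w)}}. \qquad (7.12)$$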


Substituting (7.11) and (7.12) in the partial correlation formula we
find

$$r_{vw.u} = -\sqrt{\frac{vw}{(n-u-v)(n-u-w)}}. \qquad (7.13)$$

We note that this is identical with (7.7) found above for Case (i).
Thus previous remarks concerning the validity of the partial correlation 
formula apply mutatis mutandis to Case (ii). 
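Case (ii) can also be checked without algebra by enumerating a small pack exactly. The following Python sketch is our own construction; the helper names `hyp`, `joint` and `corr` and the pack of 8 cards with 3 hearts are assumptions made for illustration. It tabulates the three individual heart scores, forms the recorded scores of V and W by adding U's score, and applies the classical partial correlation formula to the resulting total correlations.

```python
from fractions import Fraction
from math import comb, isclose, sqrt


def hyp(N, K, m, x):
    """P(x hearts) when m cards are drawn from N cards holding K hearts."""
    if x < 0 or x > K or m - x < 0 or m - x > N - K:
        return Fraction(0)
    return Fraction(comb(K, x) * comb(N - K, m - x), comb(N, m))


def joint(n, hearts, u, v, w):
    """{(x_u, x_vu, x_wu): probability} for three successive draws."""
    out = {}
    for xu in range(u + 1):
        for xv in range(v + 1):
            for xw in range(w + 1):
                p = (hyp(n, hearts, u, xu)
                     * hyp(n - u, hearts - xu, v, xv)
                     * hyp(n - u - v, hearts - xu - xv, w, xw))
                if p:
                    out[(xu, xv, xw)] = p
    return out


def corr(dist, f, g):
    """Product-moment correlation of two score functions over the joint table."""
    def mean(h):
        return sum(p * h(*k) for k, p in dist.items())
    m_f, m_g = mean(f), mean(g)
    cov = mean(lambda *k: f(*k) * g(*k)) - m_f * m_g
    v_f = mean(lambda *k: f(*k) ** 2) - m_f ** 2
    v_g = mean(lambda *k: g(*k) ** 2) - m_g ** 2
    return float(cov) / sqrt(float(v_f * v_g))


n, hearts, u, v, w = 8, 3, 2, 2, 1
dist = joint(n, hearts, u, v, w)
score_u = lambda xu, xv, xw: xu
score_v = lambda xu, xv, xw: xu + xv    # Case (ii): V records x_u + x_v.u
score_w = lambda xu, xv, xw: xu + xw    # Case (ii): W records x_u + x_w.u
r_vw = corr(dist, score_v, score_w)
r_uv = corr(dist, score_u, score_v)
r_uw = corr(dist, score_u, score_w)
partial = (r_vw - r_uv * r_uw) / sqrt((1 - r_uv ** 2) * (1 - r_uw ** 2))
print(r_vw, r_uv, r_uw, partial)
print(-sqrt(v * w / ((n - u - v) * (n - u - w))))
```

The last two printed values agree to rounding error, which is the identity of the Case (ii) result with (7.7).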


Case (iii). We now suppose that player U draws without replacement 
u times from a pack of n cards, but replaces his cards before players 
V and W make their choices of v and w cards respectively without 
replacement. We shall again take for the total scores of players V and 
W the sums of their respective individual scores and that of player U, 
so that 

$$x_v = x_{v.u} + x_u \quad \text{and} \quad x_w = x_{w.u} + x_u.$$


The algebra is now essentially similar to what we have just had, except
that the individual scores $x_{v.u}$ and $x_{w.u}$ are now independent of $x_u$. For
the partial correlation $r_{vw.u}$ we get

$$r_{vw.u} = -\sqrt{\frac{vw}{(n-v)(n-w)}}, \qquad (7.14)$$

i.e. the correlation between $x_{v.u}$ and $x_{w.u}$ in the absence of any
contribution from player U.

For Case (iii) the classical equation of partial correlation therefore 
holds good without qualification; and the reason for this is that we 
have here reverted to the postulates of the umpire bonus model by 
imposing on the common variable the condition of replacement before 
subsequent choice by the players V and W. If this is so, the effect of a 
fixed finite contribution (from player U) is exactly the same as if U 
made no contribution at all, since it merely changes the origin of the 
scores in the V-W correlation grid. When there is no replacement, 
the distinction emphasised earlier is essential, since: 


(a) the product-moment coefficient of the scores of V and W is a 
function of sampling fractions ; 


(b) a fixed contribution of U (other than zero) implies a reduction 
of the size of the finite universe from which V and W sample. 


This semantic clarification of the meaning of the partial correlation 
reinforces the cogency of remarks in an earlier communication vis a vis 
the logical limitations of any purely geometrical approach to correlation 
theory. Such an approach explicitly postulates a continuum, and hence 
implicitly an infinite universe of choice. Within such a framework of 
assumptions there is no place for the distinction we have here drawn 
between what we may mean by a partial correlation coefficient, since 
the postulate of an infinite universe is formally equivalent to the 
postulate of sampling with replacement. In other words, the geometrical 
approach necessarily excludes due consideration to the specific charac- 
teristics of mechanisms which may give rise to correlations and regres- 
sion—truly linear or otherwise—in the real world. 


8. THE BINOMIAL LOTTERY MODEL 


Closely allied to the model dealt with in the foregoing section is
the following, which embodies a correlation arising in virtue of the
constraint imposed by the trinomial $(p_a + p_b + p_c)^r$ definitive of a
3-class sampling distribution. For illustrative purposes our three classes
may be hearts, diamonds and black cards of an ordinary pack. Player
A draws an r-fold sample from a full pack with replacement of each
card chosen before taking another, recording as his score ($x_a$) the
number of hearts in the sample, so that $M_a = \tfrac{1}{4}r$. Player B draws a
sample of $(r - x_a)$ cards from an otherwise normally constituted pack
containing no hearts and records as his score the number of diamonds
in the sample. His mean score ($M_{b.a}$) associated with an A-score $x_a$
is therefore $\tfrac{1}{3}(r - x_a)$. More generally, if A records as his score
the number of cards of a class whose proportionate contribution to the
full pack is $p_a$, and B records as his score the number of cards of a
class whose proportionate contribution to the full pack is $p_b$, the
probabilities that any card B chooses will or will not belong to the latter
class are respectively $p_b/(1-p_a)$ and $(1-p_b-p_a)/(1-p_a)$, so that:


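$$M_{b.a} = \frac{r p_b}{1-p_a} - \frac{x_a p_b}{1-p_a},$$

$$M_b = \frac{r p_b}{1-p_a} - \frac{M_a p_b}{1-p_a},$$

$$M_{b.a} - M_b = -p_b(x_a - M_a)/(1-p_a). \tag{8.1}$$

Thus there is linear regression of the B-score on the A-score; and

$$k_{ba} = -p_b/(1-p_a),$$

$$V(M_{b.a}) = k_{ba}^2 V_a = r p_b^2 \cdot p_a/(1-p_a).$$

By definition also:

$$V_{b.a} = (r - x_a)\cdot\frac{p_b}{1-p_a}\cdot\frac{1-p_a-p_b}{1-p_a}
          = \frac{(r - x_a)\,p_b(1-p_a-p_b)}{(1-p_a)^2},$$

so that, on averaging over the A-scores with $M_a = rp_a$,
$M(V_{b.a}) = r p_b(1-p_a-p_b)/(1-p_a)$; and since

$$V_b = V(M_{b.a}) + M(V_{b.a}),$$

we have

$$V_b = r p_b(1-p_b).$$

Finally,

$$r_{ab}^2 = V(M_{b.a})/V_b = p_a p_b/(1-p_a)(1-p_b), \tag{8.2}$$

whence we may also write:

$$\mathrm{Cov}(x_a, x_b) = -r\,p_a p_b. \tag{8.3}$$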
Fig. 2. Binomial Lottery Model.
Conditions are as prescribed in text.


The relation $V_b = r p_b(1 - p_b)$ definitive of the variance of the B-score
distribution signifies that it is in fact independent of the antecedent
choice of player A. In other words, as is true of the foregoing model,
no meaningful partition of variance in terms of causality is admissible
in this context.

Example: A draws 3 cards from a full pack, scoring hearts as success.
         B draws from a 39-card pack containing 13 diamonds and
         scores diamonds as success. (See Fig. 2.)

Frequencies per 64 trials (rows: A score; columns: B score)

A SCORE       0      1      2      3    TOTAL    MEAN    VARIANCE
0             8     12      6      1      27      1        2/3
1            12     12      3      0      27     2/3       4/9
2             6      3      0      0       9     1/3       2/9
3             1      0      0      0       1      0         0
TOTAL        27     27      9      1      64     3/4       9/16
MEAN          1     2/3    1/3     0
VARIANCE     2/3    4/9    2/9     0

$V_a = \tfrac{9}{16} = V_b$;  $k_{ba} = -\tfrac{1}{3} = k_{ab}$;  $r_{ab}^2 = \tfrac{1}{9}$;
$M(V_{b.a}) = \tfrac{1}{2} = M(V_{a.b})$;  $V(M_{b.a}) = \tfrac{1}{16} = V(M_{a.b})$;
$V(M_{b.a})/V_b = \tfrac{1}{9} = V(M_{a.b})/V_a$.
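The worked example can be reproduced exactly with rational arithmetic. The sketch below is an illustration under stated assumptions: B is taken to draw with replacement (as the binomial form of the model implies), and the function names `binomial` and `joint_distribution` are introduced here for the purpose. It builds the joint distribution of the two scores for r = 3 and checks the frequencies per 64 trials together with relations (8.2) and (8.3).

```python
from fractions import Fraction
from math import comb

def binomial(n, k, p):
    """Probability of k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def joint_distribution(r, p_a, p_b):
    """joint[a][b] = P(A scores a and B scores b). B draws r - a cards and
    succeeds on each with probability p_b / (1 - p_a)."""
    p_cond = p_b / (1 - p_a)
    return [[binomial(r, a, p_a) * binomial(r - a, b, p_cond)
             if b <= r - a else Fraction(0)
             for b in range(r + 1)]
            for a in range(r + 1)]

# Hearts and diamonds each make up one quarter of a full pack.
r, p_a, p_b = 3, Fraction(1, 4), Fraction(1, 4)
joint = joint_distribution(r, p_a, p_b)
scores = range(r + 1)

frequencies = [[int(cell * 64) for cell in row] for row in joint]

mean_a = sum(a * joint[a][b] for a in scores for b in scores)
mean_b = sum(b * joint[a][b] for a in scores for b in scores)
var_a = sum(a * a * joint[a][b] for a in scores for b in scores) - mean_a ** 2
var_b = sum(b * b * joint[a][b] for a in scores for b in scores) - mean_b ** 2
cov = sum(a * b * joint[a][b] for a in scores for b in scores) - mean_a * mean_b
r_squared = cov ** 2 / (var_a * var_b)

for row in frequencies:
    print(row)
print(mean_b, var_b, cov, r_squared)
```

The variance of the B-score comes out as r p_b (1 - p_b) even though B's sample size depends on A's score, which is the point made in the text.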


9. SUMMARY 


1. The previous communication of this series explored the properties
of a replacement model of which we postulate a linear law of concomitant
variation but one which is consistent with linear regression
only in exceptional circumstances. In this communication we examine
the properties of a non-replacement model of which linear regression
in both dimensions is an essential property.


2. The properties of such models call for a niche which we label
constrained to accommodate a type of relationship other than consequence
(A contributes to B) or concurrence (C contributes both to A and
to B). The essential feature of such a relationship is a constraint
which imposes no limitation on the mean results of the choice of an
agent B in virtue of the prior choice of an agent A. Unlike a truly
consequential relationship, it has therefore a symmetry independent
of the order of choice.


3. Our previous investigation of the replacement (umpire bonus) model
showed that the correlation ratio is a just measure of explained
variance only in the domain of consequence, being invalid as such
in the domain of concurrence except for the trivial cases $r = \pm 1$
or 0. One peculiarity of the non-replacement and the related
binomial lottery model we have here investigated is that they admit
no meaningful partition of variance in terms of causality.


4. With no restriction on linearity of regression, the classical formula
for partial correlation holds good in all circumstances for the
replacement set-up of which linear regression is, as stated, a necessary
property; but the classical equation of partial correlation is valid
for the non-replacement model only in a special sense, as indicated
below.


5. In a replacement set-up it is not necessary to distinguish between
two ways of eliminating the effect of a common variable C on which
two otherwise correlated scores A and B depend. We are equally
entitled to interpret the product-moment $r_{ab.c}$ as the value of $r_{ab}$:
(a) when C makes no contribution to the values of either A or B;
(b) when C always makes the same contribution to the A and B
scores.
In the non-replacement set-up of this communication this distinction
is essential. The classical equation of partial correlation correctly
exhibits the effect of eliminating the contribution of C in the sense
that the result obtained corresponds to what would happen if C made
no contribution at all. It does not in fact correctly exhibit the
effect of eliminating a constant contribution of C to A and B except
in the trivial case when the constant contribution is zero.


6. The conclusions stated give added force to a thesis advanced in the 
previous communication, namely the inadequacy of a generalised 
geometrical approach to clarify implications of the manifold circum- 
stances in which correlations may arise, in particular the implications 
of sampling from a finite universe. 


AGE CHANGES IN YOUNG ADULT ARMY MALES *


BY FRANCIS E. RANDALL, Ph.D.
U. S. Army Quartermaster Corps
Climatic Research Laboratory
Lawrence, Massachusetts


INTRODUCTION 


We have become so accustomed to thinking of growth as being a
process which is involved in the production of the adult that
it is extremely difficult to define properly the terminology which should 
apply to changes which occur after the individual has reached a stage 
commonly referred to as adult. Adulthood, in itself, is subject to a 
wide variety of interpretations, in that physiological, psychological, and 
physical fulfillments may be attained at widely divergent chronological 
periods. The physiological and psychological aspects may be considered 
as outside the province of this presentation, which will be confined to 
the physical aspects of development. The studies described by Gray and 
Ayers (1931), Simmons (1944), and that of the Bureau of Home 
Economics of the Department of Agriculture (O’Brien and Girshick, 
1939) have all terminated at about age 17, with various statements 
being made in the publications that 17 was considered more or less 
terminal in the chronological sequence of growth. However, in plotting 
curves of various dimensions taken on these groups, there is a con- 
siderable amount of evidence that some increase, although admittedly 
slight, still is present. The slopes of the curves plotted vary for dif- 
ferent dimensions, so it may be concluded that a given period for the 
cessation of growth does not exist, but rather that an extended period 
of time may be required for all the different portions of the body to 
have completed their development. This fact is certainly not new, inas- 
much as roentgenographic studies have clearly shown that bone growth 
ceases at different ages for different bones. The length of time beyond 
age 17 during which increases still occur which may be related to growth 
has not been clearly demonstrated. How much effect, for instance, does 
the failure of the vertebral epiphyses to close until age 25 have on the
stature of men? Or, disregarding epiphyseal union, when is the maximum
stature of American males attained? Is it statistically sound to
compare a group of young adults between ages 20 and 24 with another
group between 22 and 26 years of age? Once adult stature is reached,
is there a period of stability maintained during which it is possible to
group young adult males into age groups which would make them mutually
comparable? The importance of these questions to the study of
human biology is obvious. Would, for example, young men of ages 17,
18, and 19 have the same distributions of bodily dimensions as would
young men of ages 20, 21, and 22?


* Publication aided by a grant from The Viking Fund, Inc., New York.


THE POPULATION 


In order to answer the questions posed above, a population of young
men was studied. The total series consisted of 17,341 Army men, distributed
over the entire United States in a close approximation to the
manner shown in the U. S. Census Report for 1940. In this series there
were approximately 3,000 each of ages 17, 18, and 19; 1,500 of ages 20 
and 21; and 1,000 of ages 22, 23, 24, 25, and 26. Owing to the wide 
distribution over the United States, and to the medical acceptability of 
the men involved, in so far as the Army was concerned, the series may 
be considered representative of the healthy American male white adult to 
a great extent. The men in ages 17 and 18 were just being inducted 
into the Army, and were without previous military experience; those 
between 19 and 26 were being separated from the Army, and had re- 
ceived from 12 to 24 months military service. Consequently, we should 
keep in mind that some differences might occur between 18 and 19 which 
were a result of the military environment. In all cases, the age given
is that of the last birthday of the individual; thus, age 25 includes men
between 25 years and 25 years and 364 days.


THE DIMENSIONS 


The dimensions which will be considered are as follows: 


Stature                   Inseam
Weight                    Stature minus inseam
Head circumference        Hand length
Neck circumference        Hand breadth
Sleeve length             Foot length
Chest circumference       Ball foot circumference
Waist circumference
A brief definition of the methods of measurement of those dimensions
not of use in usual anthropometry is in order. Sleeve length is obtained 
by placing the upper arm of the subject in a horizontal position, at an 
angle slightly forward of the transverse axis of the trunk, and the fore- 
arm at a 30 to 45 degree angle to the upper arm. The measurement, 
taken by tape, extends from cervicale to stylion with the tape passing 
over the olecranon process. Waist circumference is taken by tape hori- 
zontal to the floor at a level halfway between the lower costal margin 
and the iliac crests. Neck circumference is obtained with the tape just 
below the thyroid cartilage. Inseam, taken by anthropometer, extends 
from the nude crotch to the floor. Ball foot circumference is taken, by 
tape, over the heads of the metatarsals. Mean values of these dimensions 
for each of the age groups are listed in Table 1. 


COMPARISON OF MILITARY SERIES WITH BRUSH FOUNDATION SERIES 


A brief comparison with growth studies already in the literature will 
serve to establish a point of reference. The stature and weight curves of 
the Brush Foundation (Simmons, 1944) will be used. The stature of 
this group (Fig. 1) showed a decline in the rate of increase between 15, 
16, and 17, with the maximum, 1765 mm. being reached at age 17. The 
military series, at age 17, was slightly over 1724 mm. tall, with a maxi- 
mum being attained at age 23 at nearly 1751 mm. Stature, therefore, in- 
creased slowly between 17 and 23, with a very slight decrease being noted 
thereafter. 

The weight (Fig. 2) of the Brush series was also high, reaching a 
mean of 147.5 pounds, with a decrease in rate of increase appearing 
after age 15. However, the 17- and 18-year-old age groups in the mili- 
tary series were considerably below this value, 139.5 and 144.0, respec- 
tively. The interesting point to note, here, is that age 19, having had 
military service, has a mean weight which is more nearly on the curve 
of the Brush series than it is on the U. S.-wide increase rate as indi- 
cated by ages 17 and 18. Realizing that the Brush series was made up, 
to a great extent, of “ well-born” children, who should have received 
good nutrition, whereas the military 17- and 18-year groups were of 
general United States origins, the position of the 19-year group gains 
significance. The first conclusion which might be drawn is that the 
military diet and environment is beneficial to the group, at least so 
far as weight gain is concerned. Secondly, and of interest from the 
human biologist’s standpoint, it would appear that the general population
is capable of at least as much expression of weight as is that of a
“well-born” category if it is given the opportunity. Recalling one of
the questions posed in the introduction concerning the mutual comparability
of age groups, this difference noted, which is apparently a
result of environment, is indicative of the problems which continually
arise in the statistical definition of populations.


Fig. 1. Stature of military series compared with Brush Foundation children

Fig. 2. Weight of military series compared with Brush Foundation children


RATES OF CHANGE BETWEEN 17 AND 26 


All the dimensions studied show changes between 17 and 26. All 
except one show a positive increase, with inseam being the only one to 
show a decrease. The greatest change, as might be expected, occurs in 
weight, which increases from 139.26 pounds at age 17 up to 157.87 
pounds at age 26. Next, as might also be expected, is the waist circumference,
which increases from 73.2 cm. at 17 to 79.2 cm. at 26. Chest
circumference is not far behind, increasing from 87.6 cm. to 93.4 cm. 
between 17 and 26. Also involved in the soft tissue increase is the neck 
circumference, which increases from 35.2 cm. at 17 to 37.0 cm. at 26. 
However, the neck appears to stabilize in the 24th year. Weight and 
waist circumference are still showing small increases at age 26, but the 
slopes are so low as to indicate that the maximum is nearly attained. 

If these dimensions are placed on a comparative basis by setting 
age 17 as a 100 per cent level of attainment (Fig. 3), weight will be 
noted to have increased to 113.36 per cent, waist circumference to 108.23 
per cent, chest circumference to 106.55 per cent, and neck circumference 
to 105.12 per cent of the 17-year level by age 26. In all these cases, it 
should be noted that the final attainment based on age 17 is not quite 
valid, inasmuch as ages 17 and 18 were not of previous military service. 
In weight, for example, age 19 is 108.55 per cent of age 17. In other 
words, an increase from 100.00 per cent to 108.55 per cent between 17 
and 19, and a subsequent increase from 108.55 per cent to 113.36 per 
cent between 19 and 26 do not form a valid curve. If the latter portion 
of the curve were projected backward to age 17, in an effort to com- 
pensate for the two different environments, age 17 might very well be 
105 to 106 per cent of what it was found to be, which would reduce the 
final weight achieved by age 26 to 107.5 per cent. This would, of 
course, change the final result for the girth measurements. 
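The percentage comparison can be retraced from the means quoted above. The sketch below is an illustration only; the function names are introduced here, and the girth percentages it yields differ in the second decimal from those printed, presumably because the printed figures were computed from unrounded means. It also shows the effect of raising the age-17 base by 5 to 6 per cent, as the back-projection argument proposes.

```python
# Means quoted in the text at age 17 and at age 26.
MEANS = {
    "weight (lb)": (139.26, 157.87),
    "waist circumference (cm)": (73.2, 79.2),
    "chest circumference (cm)": (87.6, 93.4),
    "neck circumference (cm)": (35.2, 37.0),
}

def percent_of_base(value, base):
    """Value expressed as a percentage of the base (age 17) level."""
    return 100.0 * value / base

def rebased(final_percent, adjusted_base_percent):
    """Final attainment when the age-17 base itself is raised to
    adjusted_base_percent of its observed value."""
    return 100.0 * final_percent / adjusted_base_percent

relative = {name: percent_of_base(age26, age17)
            for name, (age17, age26) in MEANS.items()}

for name, value in relative.items():
    print(f"{name}: {value:.2f} per cent of the 17-year level")

# Weight at 26 if age 17 is taken as 105 to 106 per cent of the observed mean.
for base in (105.0, 106.0):
    print(base, round(rebased(relative["weight (lb)"], base), 2))
```

The two rebased weight values bracket the 107.5 per cent quoted in the text.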

Changes which are basically a result of bony growth show a somewhat
different picture. Stature reaches its maximum, in this series, at
age 23, 175.1 cm. Even though the mean values indicate a maximum
attained at age 23, there is no statistically significant change after age
18. Consequently, evidence is strong that the American white male
attains his adult stature, as an average, in the 18th year.

Inseam, composed almost entirely of linear portions of the lower
extremities, shows a slight decrease, but this is apparently a result of
two factors: one, the deposition of fat in the perineum, and two, an
increasing difficulty in obtaining a good crotch approximation with the
bar of the anthropometer because of increasing fat deposition on the
medial aspects of the thigh near the crotch. Stature minus inseam,
however, shows a rather marked increase between 17 and 24, primarily
between 17 and 20. This final “growth” may be a result of epiphyseal
growth just prior to closure of the vertebral epiphyses at about age 25,
or may also be a result of more erect postures.

Fig. 3. Relative increases of some dimensions between ages 17 and 26

Head circumference shows its last growth phase between 17 and 19,
with a slow, persistent trend producing another 0.1 inch by age 24. There
are no indications that the continued growth occurs in the manner which
Hrdlička (1936) described.
Of the four dimensions of the extremities (Fig. 4), the only one
which shows any increase after 17 is the hand breadth, which increases
two per cent between 19 and 22. This possibly is connected with the
weight increase, but may also be a result of muscular conditioning of the
hand, since the heads of the metacarpals have closed by the 19th year.


Fig. 4. Relative increases of dimensions of extremities between ages 17 and 26
PAIRED RELATIONSHIPS


Quite often a discussion confined to mean values of separate dimensions 
may produce a very misleading understanding of the situation which 
actually exists in a population. To say that stature is constant after 
age 18 but that weight continues to increase is one thing, but this state- 
ment hasn’t answered very much of the question. Certainly, beyond this 
straightforward statement is a further question. What is the relation- 
ship of weight to stature as regards their variabilities throughout the 
period which is being considered? It might be assumed, at first, that a 
general consistency of relationship exists over the period which might 
be termed young adult. 

In order to introduce this consideration, a reference to the Brush
series (Fig. 5) will be in order. The sequence of the correlation coefficients
between 1 and 17 years in the Brush series shows that the value
of r begins at .625, increases rather rapidly to about .750 at age 3, remains
at that level until 13 or 14, then decreases again to .650 at 17. In
the military series, the r value at age 17 is .510, with a steady decrease
occurring through the period studied, reaching .410 at age 26. Only
during ages 19, 20, and 21 does it remain rather stable at .490. From
these observations, it might be concluded that the relationship of weight
to stature becomes less and less after age 14, at least through age 26
(Fig. 6). Since r becomes less during successive ages in the military
series, the regression equation, weight = a + b · stature, showing the
relation of weight to stature, naturally changes, with the slope of the
regression (b) of weight on stature decreasing and the y (weight) intercept
increasing, as shown in Fig. 6.

From these observations, it would appear that, in so far as weight and
stature are concerned, there is only a brief period between 19 and 21
which might be considered at all stable.

Fig. 5. Age changes in correlation between stature and weight
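The link between a falling r and the changing regression line can be made concrete with the standard least-squares relations b = r · (s.d. of weight / s.d. of stature) and a = mean weight − b · mean stature. The sketch below is an illustration only: the means and standard deviations are hypothetical round figures, not values from the military series, and are held fixed so that only r varies between the two correlations reported for ages 17 and 26.

```python
def regression_of_weight_on_stature(r, mean_w, sd_w, mean_s, sd_s):
    """Intercept a and slope b of the least-squares line weight = a + b * stature."""
    b = r * sd_w / sd_s
    a = mean_w - b * mean_s
    return a, b

# HYPOTHETICAL summary figures, chosen only to illustrate the direction of change.
MEAN_WEIGHT_LB, SD_WEIGHT_LB = 150.0, 20.0
MEAN_STATURE_CM, SD_STATURE_CM = 175.0, 6.5

a_17, b_17 = regression_of_weight_on_stature(
    0.510, MEAN_WEIGHT_LB, SD_WEIGHT_LB, MEAN_STATURE_CM, SD_STATURE_CM)
a_26, b_26 = regression_of_weight_on_stature(
    0.410, MEAN_WEIGHT_LB, SD_WEIGHT_LB, MEAN_STATURE_CM, SD_STATURE_CM)

print(f"r = .510: weight = {a_17:.1f} + {b_17:.3f} * stature")
print(f"r = .410: weight = {a_26:.1f} + {b_26:.3f} * stature")
```

With the other quantities fixed, a smaller r gives a flatter slope and a higher intercept; in the actual series the rising mean weight would push the intercept up further.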
Since it was found that weight became increasingly variable in its
relation to stature, it should not come as a surprise that chest circumference
(Fig. 7) follows the same pattern in its relation to stature. The
value of r decreases along a convex curve, remaining rather stable between
17 and 21.

Here, then, are examples of situations which serve to confuse that 
aspect of cessation of growth referred to earlier. Certainly it is clear 
that cessation is subject to question in its definition in regard to the 
entire body. 

The relation of hand breadth to hand length is the only one which 
has apparently stabilized by age 17, but even here, although r remains 
constant, the hand breadth does increase between 17 and 21. 

The final relationship studied (Fig. 8) is that between foot length and
ball foot circumference. The r value is highest between 20 and 23 where
it may be considered rather stable. The interesting point to be noted
here is that both the length and the circumference remain stable in
dimension following age 18. Even so, r at 26 is only slightly above the
17- and 18-year values.

Fig. 8. Relations of foot dimensions between ages 17 and 26


CONCLUSIONS 


It should be clear that we are dealing with a range of years over which 
some portions of the body have ceased growth, while others are still 
involved in increase in dimension, even though we may not fully agree
on terming this increase growth. Since the increase is open to question
in its definition, the logical conclusion then must be that the definition 
of growth must come under new consideration. Certainly, the difference 
in the terms of increase in dimension and growth is subtle, but this very 
subtlety must be realized before clarification of the terms can be attained. 
One possible way out of this dilemma is to segregate the concepts into 
two categories: the cessation of skeletal growth as defined by the closure 
of epiphyses; and the cessation of growth or increase in dimension of 
the soft tissues. One objection to this type of consideration is that 
increasing age, decrease of muscular tonicity, and physical conditioning 
may all be contributory to a change in dimension which is not a result of 
growth. Waist circumference would be notable in this respect. 

From the standpoint of the human biologist, the variability of succes- 
sive age groups should certainly serve as a warning to exert extreme care 
in the weighting of populations for comparative purposes. A common 
practice, for example, has been to group series into five-year periods, 
20-25, 25-30, 30-35, etc., which, on the surface appears quite acceptable. 
However, if the proportions of ages within the sub-groups differ to any 
marked degree, highly spurious results may be expected. 
THE FAT/BONE INDEX AS A SEX-DIFFERENTIATING CHARACTER IN MAN


BY EARLE L. REYNOLDS


The Fels Research Institute for the Study of Human Development
Antioch College, Yellow Springs, Ohio


There are generally recognized sex differences in the tissue composition
of the human body. Males, for instance, commonly have
more muscle and bone, and less adipose tissue, than females. Such 
observations have been based on anatomic research, such as reported by 
Jacoby (1939), and Wilmer (1940), as well as on the analysis of body 
measurements, in such studies as Franzen (1929), Kornfeld and Schiller 
(1930), and McCloy (1936). 

In recent years the study of tissue differentiation has been facilitated 
by the use of the x-ray. Research on tissue growth and sex differences 
in childhood, as seen in roentgenograms, has been reported by Stuart 
(1940, 1942, 1946), Reynolds (1944, 1946, 1948), and Reynolds and 
Grote (1948). The method has been applied to the study of the growth 
of triplets (Reynolds and Schoen, 1947), to the analysis of creatinine 
excretion in children as related to muscle mass (Reynolds and Clark, 
1947), and to the measurement of obesity (Reynolds and Asakawa, 
1948). 

The present study attempts to determine to what extent a single value, 
the fat / bone index, serves to differentiate the sexes. The index is 
based on simple measurements of the breadths of fat and of bone, as seen 
in roentgenograms of the leg in 505 children and adults. 


MATERIALS AND METHODS 


The distribution of subjects, by age and sex, is shown below:


Age level        Male    Female
7.5 years         59       47
10.5 years        43       53
13.5 years        33       32
16.5 years        17       13
Adult             89       89
Total            241      234


An additional series of adult females........ 30
The children are all regular participants in the long-term longitudinal
study of normal human growth and development being conducted by
the Fels Research Institute (Sontag, 1946). The adults are chiefly
staff members or parents of the children, the males having a mean age
of 38 years, the females a mean age of 33 years. The group as a whole
may be taken to represent an essentially “normal” sample of the human
population.

On each subject, an anteroposterior roentgenogram of the left leg was 
taken, at a six-foot focal-film distance. The position was standardized, 
with the subject standing, weight equally distributed, feet pointing 
directly forward. Other details of technique have been given in earlier 
papers. 

The shadows of fat (plus skin), muscle masses and bones are clearly 
defined on such a roentgenogram. On the x-ray film, measurements of 
tissue breadths were taken across the level of the greatest width of calf. 
Fat breadth represents the combined thickness of superficial adipose 
tissue (plus skin), medial and lateral; bone breadth represents the thick- 
ness of tibia plus fibula at this same level. From these measurements, 
the fat / bone index is derived:

$$\text{fat / bone index} = \frac{\text{Breadth of fat in mm.}}{\text{Breadth of bone in mm.}} \times 100.$$
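As a small illustration of the definition, the index can be computed as follows. The function name and the sample breadths are hypothetical and are not measurements from the series.

```python
def fat_bone_index(fat_breadth_mm, bone_breadth_mm):
    """Breadth of fat as a percentage of breadth of bone, both taken at
    the level of greatest calf width on the roentgenogram."""
    if bone_breadth_mm <= 0:
        raise ValueError("bone breadth must be positive")
    return 100.0 * fat_breadth_mm / bone_breadth_mm

# Hypothetical subject: 12 mm of fat (medial plus lateral), 40 mm of bone.
print(fat_bone_index(12, 40))
```

Because the index is a ratio of two breadths read from the same film, it does not depend on the unit of measurement, only on both breadths being taken at the same level.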


There are, of course, other measurements and observations which may 
be obtained from such a roentgenogram. Some of these have been dis- 
cussed in earlier reports. The present paper will confine itself to a 
consideration of sex differences, as shown by this index, in the series of 
cases described above. 

RESULTS 


Means and standard deviations for the fat / bone index, by age and
sex, are shown in Table 1. The sexes show no over-lap in means: the
largest value for males (44.2) is smaller than the smallest value for
females (50.8). The mean values for males decrease with age; for
females, the means show an increase after 13.5 years. The increasing
divergence between the sexes is indicated by the critical ratios of the 
differences between the means: 


Age level Critical ratio 
TABLE 1


The fat/bone index


                  NUMBER                 STANDARD
GROUP            OF CASES      MEAN     DEVIATION
Males
  7.5 years         59         44.2       11.8
  10.5 years        43         41.9       12.9
  13.5 years        33         37.9       11.3
  16.5 years        17         30.4       13.2
  Adult             89         27.8       10.4
Females
  7.5 years         47         51.2       15.0
  10.5 years        53         51.7       13.5
  13.5 years        32         50.8       15.3
  16.5 years        13         58.6       20.4
  Adult             89         61.1       19.0


These ratios represent differences significant beyond the .01 level. 
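The critical ratios themselves can be recomputed from Table 1. The sketch below assumes the usual large-sample form, the difference between two means divided by the standard error of that difference; the source does not state which formula was used, so the figures it prints are a reconstruction and not the author's tabulated values. The count of 13 for females at 16.5 years is taken from the stated total of 30 cases at that age.

```python
import math

# (number of cases, mean, standard deviation) for males and females, from Table 1.
TABLE_1 = {
    "7.5 years":  ((59, 44.2, 11.8), (47, 51.2, 15.0)),
    "10.5 years": ((43, 41.9, 12.9), (53, 51.7, 13.5)),
    "13.5 years": ((33, 37.9, 11.3), (32, 50.8, 15.3)),
    "16.5 years": ((17, 30.4, 13.2), (13, 58.6, 20.4)),  # 13 inferred: 30 cases in all
    "Adult":      ((89, 27.8, 10.4), (89, 61.1, 19.0)),
}

def critical_ratio(males, females):
    """Difference between the means over the standard error of the difference
    (an assumed formula; see the note above)."""
    n_m, mean_m, sd_m = males
    n_f, mean_f, sd_f = females
    standard_error = math.sqrt(sd_m ** 2 / n_m + sd_f ** 2 / n_f)
    return (mean_f - mean_m) / standard_error

ratios = {age: critical_ratio(m, f) for age, (m, f) in TABLE_1.items()}

for age, ratio in ratios.items():
    print(f"{age:<11} {ratio:5.2f}")
```

On this reckoning the ratio rises at every successive age level and exceeds 2.58, the two-sided 1 per cent point of the normal curve, throughout.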

Table 2 shows the percentage distribution of the fat / bone index, by 
age and sex. The approximate position of the mean in each distribu- 
tion is indicated. 

Each distribution pattern has been examined, to determine the actual 
efficiency of this index in sexing our sample. In the first three age- 
groups, representing children and young adolescents, a point of division 
was made at the 44-45 level. That is, indices of 44 and below were 
called “ male,” and indices of 45 and above were called “female.” Using 
this criterion, the percentage of error in sexing the first three groups is
as follows:


                  Percentage of error
Age level       Boys     Girls     Mean
7.5 years        38        40       39
10.5 years       34        24       29
13.5 years       21        30       26


During childhood, therefore, using the 44-45 level as a point of 
division, the fat / bone index will sex individuals with about 70 per cent 
accuracy, if the present sample be taken as representative. 

Amongst older adolescents (16.5 years) and in the adult series, the
sexing value of the index is markedly better. The former group contains
only 30 cases, but the adult series is adequate. At 16.5 years, using
the 44-45 level as a point of division, the percentage of error in sexing
the group is 6 per cent for the boys and zero for the girls. In adults,
using the same level of division, the percentage of error is 4 per cent for
the men and 16 per cent for the women.


TABLE 2 


Percentage distribution of the fat/bone index by age and sex 


7.5 YEARS 10.5 YEARS 13.5 YEARS 16.5 YEARS ADULT 
INDEX M F M F M F M F M F 


5-8 
9-12 
13-16 6 
17-20 12 
21-24 
25-28 
29-32 10 
33-36 15 
37-40 18 
41-44 19 
45-48 


49-52 
53-56 
57-60 
61-64 
65-68 
69-72 
73-76 
77-80 
81-84 
85-88 
89-92 
93-96 
97-100 
101-104 
105-108 
109-112 
113-116 
(The line within each column indicates the approximate level of the mean for
that distribution.)


When the point of division is changed, in the adults, from the 44-45
level to the 40-41 level, the percentage of error in sexing the adult series
is 8 per cent for each sex. Thus, if adults with a fat / bone index of 40
and below are designated as male, and those with a fat / bone index of
41 and above are designated as female, the present series will be correctly
sexed with better than 90 per cent accuracy.
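The sexing rule and the accuracy figures can be set out in a few lines. The sketch below is an illustration only: the function names are introduced here, the index is assumed to be rounded to a whole number before the rule is applied, and accuracy is taken as the unweighted mean over the two sexes, as in the "Mean" column reported for the children.

```python
def sex_from_index(index, male_max=44):
    """Point-of-division rule: indices of male_max and below are called
    male, those above it female. Use male_max=40 for the 40-41 division."""
    return "male" if index <= male_max else "female"

def accuracy(male_error, female_error):
    """Percentage correctly sexed, averaging the two sexes equally."""
    return 100.0 - (male_error + female_error) / 2.0

# Percentage of error (males, females) as reported in the text.
REPORTED_ERRORS = {
    "7.5 years, 44-45 division":  (38, 40),
    "10.5 years, 44-45 division": (34, 24),
    "13.5 years, 44-45 division": (21, 30),
    "16.5 years, 44-45 division": (6, 0),
    "Adult, 44-45 division":      (4, 16),
    "Adult, 40-41 division":      (8, 8),
}

for group, (male_error, female_error) in REPORTED_ERRORS.items():
    print(f"{group:<28} {accuracy(male_error, female_error):5.1f} per cent correct")
```

The three childhood groups average a little under 70 per cent correct, and the adult series reaches 92 per cent at the 40-41 division, in line with the statements in the text.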

As a partial check on the above findings, an additional series of 30 
adult women was examined. The average age of this group was 34.5 years,
with a mean fat / bone index of 60.7 + 23.8. These values are in line 
with the regular series. Three of these women (10 per cent) have a 
fat / bone index in the male range, that is, a value of 40 or lower. The 
other 90 per cent are within the area of female distribution. 


DISCUSSION 


In common with other measures of tissue differentiation which have 
been described by this laboratory, the fat / bone index has certain advan- 
tages and limitations (Reynolds, 1946). It provides information not 
available from external measurements; it is easy to obtain from a roent- 
genogram of the leg; it is well adapted to longitudinal studies of human 
development. However, the index is derived from linear measurements 
of body mass, and shares the limitations of such measures. It describes
certain structural relations within the body, but not completely so. Used 
in conjunction with other measures of tissue differentiation, the fat / bone 
index offers another tool for the study of body structure in man. 

Nude photographs, additional roentgenograms and a series of anthro- 
pometric measurements are available on all of the children and most of 
the adults reported on in the present paper. A preliminary examina- 
tion of these materials indicates a close association between the fat / bone 
index and different types of body build. Further work on this problem
is under way. 


SUMMARY 


Sex differences in the fat / bone index are described in a series of 505 
children and adults. The index is defined as the relation of breadth of 
fat to breadth of bone, as seen in a roentgenogram of the leg. 

The index tends to decrease with age in males, and increase with age 
in females. The mean index is significantly higher in females at each 
age-level studied, the significance of the difference between the sexes 
increasing with age. In the adult, the fat / bone index differentiates the 
sexes with 90 per cent accuracy. 

A close association between fat / bone index and body build was 
observed. 